OK. You completely misunderstood what I was saying. If people use their already existing data, it will invalidate the testing utility I'm making. Why? Imagine this.
You post an ad in the newspaper saying you'd like to do a study of how likely it is to be hit by lightning. Not surprisingly, the people who answer your ad are those most concerned about this issue (that is, people who have been hit). After looking at all your volunteer test subjects, you conclude that the chances of being hit by lightning are 1 in 1.
Problem: None of the people who weren't hit by lightning volunteered.
Solution: Take the volunteers, but toss out everything that happened to them before they signed up for your study. Dismiss their preexisting data, and collect new data from this point on.
The rule of thumb with statistical tests is never to use the data that made you want to do the test. Test forward from the point in time you decide to do the test and dismiss what's gone before.
All data is by definition past data. The past I'm talking about here, the part that should be ignored, is what happened before you decided to do the test.
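To make the volunteer problem concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration, not part of the testing utility: the function name `simulate`, the 30% win rate, the 50-play history, and the recruitment rule that only players with 9 or fewer past wins bother to sign up. Every simulated player is on the same fair machine.

```python
import random


def simulate(n_players=20000, n_past=50, n_future=50,
             p_win=0.3, max_past_wins=9, seed=0):
    """Compare pre-signup and post-signup win rates for self-selected volunteers.

    All parameters are hypothetical. Every player has the same true win
    probability p_win, so any difference comes from who volunteers.
    """
    rng = random.Random(seed)
    past_rates = []
    future_rates = []
    for _ in range(n_players):
        # History that exists before the player ever hears about the test.
        past_wins = sum(rng.random() < p_win for _ in range(n_past))
        # Assumed recruitment rule: only unlucky players volunteer.
        if past_wins <= max_past_wins:
            # Data collected after signing up, on the same fair machine.
            future_wins = sum(rng.random() < p_win for _ in range(n_future))
            past_rates.append(past_wins / n_past)
            future_rates.append(future_wins / n_future)
    n_volunteers = len(past_rates)
    mean_past = sum(past_rates) / n_volunteers
    mean_future = sum(future_rates) / n_volunteers
    return mean_past, mean_future, n_volunteers


if __name__ == "__main__":
    mean_past, mean_future, n_volunteers = simulate()
    print("volunteers:", n_volunteers)
    print("win rate from pre-signup data:", round(mean_past, 3))
    print("win rate from post-signup data:", round(mean_future, 3))
```

Under these assumptions the pre-signup win rate of the volunteers can never exceed 9/50, even though the machine pays at 30%, while their post-signup results should land close to 30%. The machine never changed; only the starting point of the data did.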
~FK
--- In vpFREE@yahoogroups.com, "cdfsrule" <cdfsrule@...> wrote:
>
> I know I am taking this quote out of context (sorry FK), but your statement:
>
> --- In vpFREE@yahoogroups.com, "Frank" <frank@> wrote:
> >
> >Statistical tests cannot be used on anything that's already happened, or else one opens the door for selective recruitment and confirmation bias.
> >
> > ~FK
> >
>
> is absolutely not true. In fact, statistical tests can only be used on "data", that is, on stuff that has already been observed, computed, recorded, etc. In fact, statistical tests are used in determining (in the sense of ascribing a probability to) whether there is or was bias, selective recruitment, etc. in events (and associated data) that have already occurred.
>
> Take a look at: http://en.wikipedia.org/wiki/Statistical_hypothesis_testing
>
[vpFREE] Re: how to tell if your machine is fair?