The problem with p-values (again)

I just saw a great quote that reminds me of the post on p-values I wrote a few days ago.

To get it, you just need a little background (in case you didn’t read the earlier post).

With a classical hypothesis test, you have a null model that gives you a distribution of outcomes. To test the hypothesis you make an observation, say \hat{x}, and then consider how likely it is, under this null distribution, to get a value as high as or higher than \hat{x}.

The p-value of \hat{x} is the probability, under the null distribution, of observing a value at least as high as \hat{x}. If this probability is lower than some pre-chosen threshold \alpha, you reject the null hypothesis.  Note that you never actually observe any outcome higher than \hat{x}; you reject the hypothesis just because observing anything that high would be unlikely under it.

The hypothesis is rejected based on how much probability it puts on \hat{x} and on all values larger than \hat{x}, even though no value larger than \hat{x} was actually observed.
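To make this concrete, here is a minimal sketch in Python. The setup is an illustrative assumption, not something from the earlier post: the null model is a fair coin tossed 20 times, the observation \hat{x} is 15 heads, and \alpha is 0.05. The function names are made up for the example. It splits the p-value into the probability of the outcome that was observed and the probability of the higher outcomes that were not.

```python
from math import comb


def point_probability(x_obs: int, n: int, p_null: float = 0.5) -> float:
    """Probability of exactly x_obs successes under Binomial(n, p_null)."""
    return comb(n, x_obs) * p_null**x_obs * (1 - p_null) ** (n - x_obs)


def upper_tail_p_value(x_obs: int, n: int, p_null: float = 0.5) -> float:
    """One-sided p-value: probability of x_obs or more successes under the null."""
    return sum(point_probability(k, n, p_null) for k in range(x_obs, n + 1))


# Hypothetical experiment: 20 tosses, 15 heads, fair-coin null, alpha = 0.05.
n, x_obs, alpha = 20, 15, 0.05

p_value = upper_tail_p_value(x_obs, n)
observed_part = point_probability(x_obs, n)  # the outcome we actually saw
unobserved_part = p_value - observed_part    # 16..20 heads, never seen

print(f"p-value:                      {p_value:.4f}")
print(f"  from the observed outcome:  {observed_part:.4f}")
print(f"  from unobserved outcomes:   {unobserved_part:.4f}")
print("reject null" if p_value < alpha else "do not reject null")
```

With these numbers the p-value is roughly 0.021, so the fair-coin hypothesis is rejected at the 0.05 level. Part of that 0.021 is probability the null assigns to 16 through 20 heads, which are outcomes that never occurred.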

What the use of P implies, therefore, is that a hypothesis that may be true may be rejected because it has not predicted observable results that have not occurred.

Jeffreys

