Featuring some of my greatest scientific idols, including Feynman and Sagan

# Month: February 2011

## Oh no…

Are blogs really science journalism? Maybe not, but the science news *I* get, I mainly get from great blogs, so even if it isn’t “proper” science journalism, I think I can say that it is where at least some scientists get their news.

So it is a bit strange when a press officer doesn’t want to talk to bloggers.

- Would you tell a blogger, “I think this is all you need for a blog?”
- In defence of science blogs (yet again)

I know that I would absolutely *love* to have my research mentioned by Ed Yong (I’ve been reading his Not Exactly Rocket Science for years), so if a press officer at AU didn’t give him all the information he wanted, I’d be rather miffed.

## Virgins

I’m watching the latest episode of *Supernatural*, titled *Like a virgin*, and something strikes me. In all other aspects of life, experience is a good thing. How come, when it comes to sex, not knowing what the hell you are doing is a *virtue?*

Humans are crazy. It is time to give the bonobos a chance at the top of the food chain.

Completely beside the point, naming a website *Virgen scan* for virus gene scans is a bad idea. You won’t get the right kind of visitors. I know.

Loved this exchange in the episode of *Supernatural:*

Dean: So what can you tell me about dragons?

Bobby: They are not like the Loch Ness monster, dragons aren’t real!

Dean: Can you make a few calls?

Bobby: To where? Hogwarts?

## Go Bayesian

By “doing data analysis in a patternless way,” I meant statistical methods such as least squares, maximum likelihood, etc., that estimate parameters independently without recognizing the constraints and relationships between them. If you estimate each study on its own, without reference to all the other work being done in the same field, then you’re depriving yourself of a lot of information and inviting noisy estimates and, in particular, overestimates of small effects.

Couldn’t agree more.
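A minimal sketch of the “overestimates of small effects” part, under assumptions that are mine and not in the quote: every study estimates the same small true effect with the same standard error, and we only pay attention to the studies that come out significant at the 5% level. The function name and all the numbers are made up for illustration. It only shows the problem, not the Bayesian fix.

```python
import random
import statistics


def significant_estimates(true_effect=0.2, standard_error=1.0,
                          n_studies=50000, seed=1):
    """Simulate independent studies and keep those significant at the 5% level.

    Each study's estimate is drawn from Normal(true_effect, standard_error).
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    estimates = [rng.gauss(true_effect, standard_error) for _ in range(n_studies)]
    # Two-sided 5% threshold for a normally distributed estimate.
    return [e for e in estimates if abs(e) > 1.96 * standard_error]


significant = significant_estimates()
print("true effect: 0.2")
print("mean |estimate| among significant studies:",
      statistics.mean(abs(e) for e in significant))
```

Every estimate that passes the filter is, by construction, larger in magnitude than 1.96 standard errors, so with a true effect this small the “significant” studies all overshoot it by a wide margin.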

See also: ESP paper rekindles discussion about statistics

Not directly related, but quite relevant.

If you are interested in statistics, I really recommend Andrew Gelman’s books and blog.

## P-values again again

For no apparent good reason, I read an old post on p-values and re-read this comment:

John Larkin Says:

Hi.sorry. I have trouble with the “if you repeat experiment lots of times…p value…uniformly distributed between 0 and 1”.

Is that true? If you do it lots of times do you get as many grouped around 0.0-0.01 as around 0.49-0.50?

It may be because I’m thinking of “experiments” (e.g. height of groups)…vs some statistical scenario which uses the word stochastic – which clearly puts me in trouble.

I don’t think of p-value as direct measure of likelihood of null hypothesis. But if you compared two big samples (huge!) from two big groups twice (say, of height) and each experiment gave you a p-value of 0.99….I just get the feeling that these two groups might be very similar/same population…..

Cheers

JL

My answer was this:

Thomas Mailund Says:

John: Yes, p-values are uniformly distributed (under the null distribution) so you do expect to observe as many in the interval 0.0-0.01 as in 0.49-0.5.

You cannot consider a p-value of 0.99 as any kind of measure of similarity. It just doesn’t work that way.

The reason we are interested in low p-values is because if we sample from a mixture of the null distribution and the alternative distribution, then we expect more of the alternatives in the low end of p-values than we expect from the null.

Hope that helps.

Now that I think about it, this isn’t strictly true.

I still hold that p-values are uniformly distributed under the null model. So under the null model, you cannot conclude that a high p-value indicates strong support for the null model whereas low p-values support the alternative model. It doesn’t work like that.
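If you want to check the uniformity claim yourself, here is a small simulation sketch. It assumes a particular setup that is not from the original discussion: a two-sided z-test of the sample mean, with the data really drawn from the null model (standard normal). The function names, sample size and number of repetitions are all illustrative choices.

```python
import math
import random


def z_test_p_value(sample, null_mean=0.0, null_sd=1.0):
    """Two-sided z-test p-value for the sample mean, using the null model's sd."""
    n = len(sample)
    z = (sum(sample) / n - null_mean) / (null_sd / math.sqrt(n))
    # P(|Z| >= |z|) for a standard normal Z.
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_p_values(n_experiments=20000, n=30, seed=1):
    """Repeat the experiment many times with data drawn from the null model."""
    rng = random.Random(seed)
    return [
        z_test_p_value([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(n_experiments)
    ]


def fraction_in(p_values, low, high):
    """Fraction of p-values in the half-open interval [low, high)."""
    return sum(low <= p < high for p in p_values) / len(p_values)


p_values = simulate_p_values()
print("fraction in [0.00, 0.01):", fraction_in(p_values, 0.00, 0.01))
print("fraction in [0.49, 0.50):", fraction_in(p_values, 0.49, 0.50))
```

Both fractions should come out close to 0.01, up to simulation noise, which is the “as many around 0.0-0.01 as around 0.49-0.50” statement from the comment.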

But of course, the null model can be wrong in more than one way, and not all will show up as low p-values.

If your null model tells you that there should be a certain variance, and you see less, then you will probably see an excess of high p-values. The observations are more similar than they should be (under the null model).

You won’t see the problem as too many low p-values, but as too many high values.
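A rough sketch of that case, again with an assumed setup: a two-sided z-test where the null model says the standard deviation is 1, but the data actually has a standard deviation of 0.5. The function names and the specific numbers are hypothetical, picked only to make the effect visible.

```python
import math
import random


def p_values_when_sd_is(true_sd, null_sd=1.0, n_experiments=20000, n=30, seed=1):
    """Two-sided z-test p-values when the data has sd `true_sd`
    but the null model assumes `null_sd`."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_experiments):
        mean = sum(rng.gauss(0.0, true_sd) for _ in range(n)) / n
        z = mean / (null_sd / math.sqrt(n))
        p_values.append(math.erfc(abs(z) / math.sqrt(2)))
    return p_values


def histogram(p_values, bins=10):
    """Fraction of p-values in each of `bins` equal-width bins on [0, 1]."""
    counts = [0] * bins
    for p in p_values:
        counts[min(int(p * bins), bins - 1)] += 1
    return [c / len(p_values) for c in counts]


for true_sd in (1.0, 0.5):
    fractions = histogram(p_values_when_sd_is(true_sd))
    print("true sd", true_sd, [round(f, 3) for f in fractions])
```

With the correct variance, each bin should hold roughly a tenth of the p-values. With the smaller variance, the low bins are nearly empty and the mass piles up towards 1.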

If the p-values are not uniformly distributed, your null model is wrong. It can be wrong in so many ways that it really doesn’t matter why it is wrong. It is just wrong.

Hope that makes sense.