I guess Neanderthals weren’t stupid after all…

Actually, I have no particular reason to believe that Neanderthals should be dumber than Homo sapiens. Well, they were around for thousands of years with very little development, if any, in the kind of tools they used (or left behind, in any case), while we have managed to improve on tool design quite frequently.

At least lately; I don't know how fast we were innovating in the Stone Age.

In any case, where I wanted to go with this is this page, which describes a study of Neanderthal and Homo sapiens tools and concludes that they were equally efficient.

Does this mean that Neanderthals were as smart as us?  Not necessarily, I guess, but at least they were as technically advanced.

How did we outcompete them, exactly, and why are they gone?

Replicating haplotype findings

I have a small problem.

We have analysed some cancer data from DeCODE as part of the association mapping project PolyGene. We used Blossoc for this and we found some candidate regions worth examining further.

We have access to samples from Spain and the Netherlands, and we want to try to replicate the findings there. Now the problem is how to choose a strategy for replication.

Blossoc is a haplotype method that tries to infer the local genealogy in a region and then examines the clustering of phenotypes on this genealogy. The problem with such an approach is that you really need to type an entire region in the replication population to do the same trick there. That means typing a lot of markers in the replication sample (expensive) and potentially correcting for a lot of tests (reducing power). It is not really the way to go.
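Roughly, the power cost of the extra tests can be seen with a crude Bonferroni calculation. The whole-region marker count below is a made-up number purely for illustration; the 43 is the total number of selected SNPs mentioned below.

```python
# Illustration of the multiple-testing cost; Bonferroni is the crudest
# possible correction, and the 500-marker region size is hypothetical.
alpha = 0.05

markers_full_region = 500   # made up: typing a whole candidate region densely
markers_selected = 43       # the SNPs selected across all candidate regions

# Bonferroni-corrected per-test significance thresholds
threshold_full = alpha / markers_full_region
threshold_selected = alpha / markers_selected

print(threshold_full)      # 0.0001 -- a much harder bar to clear
print(threshold_selected)  # ~0.00116
```

With fewer, pre-selected markers, each individual test gets a far less punishing significance threshold, which is part of the point of reducing the marker set before replication.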

We extended Blossoc to output the SNPs it considers most important for the genealogy inference in each interesting region. These should be the most informative SNPs to carry forward to the replication, and this gave us 2-6 SNPs per candidate region (only 43 SNPs in all for three diseases, so no small reduction).

We have typed these SNPs in the replication population, but now we need to figure out how to try to replicate the findings with only that.

It goes without saying that we need to decide exactly what to test for based on the original data. If we start searching for significant signals in the new data, we are no longer replicating but data trawling, and the risk of false positives increases drastically.

I have a program for listing all haplotype patterns in a data set and testing them for association, and I can run that on the old data to pick the patterns to test for in the new data.  There is a tradeoff, though, between association scores and the complexity of the pattern.  There is bound to be some overfitting in the old data, and we want to avoid that in the patterns to replicate.

It is a tricky problem…


I was supposed to be in Iceland this week, visiting DeCODE.  I’m not.  Late last week I got a toothache. I’ve been to both the dentist and the doctor, but they couldn’t figure out why I had it, so they’ve just put me on painkillers to see if it disappears by itself.  Pure symptom treatment.

I love painkillers.  They are a great invention.  For me, right now, they only work some of the time, but it is a lot better than nothing.

Still, I would much prefer treating the actual cause, but as long as that is unknown there is no choice.

It sucks, though, to be stuck at home when I should be analysing data at DeCODE.