28 Jan


Imagine yourself standing on a curve of progress. You look back on the years that went before you. There were wrong turns, periods of stagnation, conflicting roads of progress. Still, there was progress.

Imagine this progress being exponential, as the progress of human culture and technology certainly was.

You look back far. Into the palaeolithic. Where earlier forms of man roamed the earth and where early Homo sapiens evolved. Slowly stone technology evolved. We do not know how much non-stone technology evolved; we have no evidence left of such technology because it didn’t make it through time to us. But we look back and see the slow evolution of stone technology.

Looking at time closer to the present we see cities being built. First perhaps just as meeting grounds for ceremony but as agriculture gets invented we see the density of people growing in isolated places. We see them inventing and sharing technology.

We see historical time unfolding. We see civilisation grow. We see the density of humans grow; the density of minds communicating grow; the inventions of humans spreading between minds. Where earlier an invention would sparkle into existence and soon be lost because no mind would be there to catch it and give the flame the fuel it needed to grow, now there is always adequate fuel. Now ideas spread. Inventions spread. The technology of the species grows.

Sure, from time to time, inventions are lost. Through disasters we lose civilisations. Through the idiocy that is also part of our species we actively destroy ideas. But still, minds get closer together. Ideas spread. Ideas are not lost at the rate they used to be. Rather, they are caught. They are seeds that grow in new fertile minds.

Through the classical civilisations. Through the middle ages (that, no, were not the dead waters of ideas that history books tell you). Through the enlightenment. We see ideas growing. Improving. We see civilisation and technology and our understanding of the world improving. Growing.

We look back and see a very slow progress of technology through the palaeolithic. About ten thousand years ago we see a slight jump but more of a speed up in the progress of technology through the neolithic. Progress speeds up when we see the copper age and the bronze age. Writing gets invented and now minds communicate, not just between contemporaries but through time.

Look closer to yourself in time. Communication is now global. We are still faulty mammals. We make mistakes. But good ideas spread. Inventions quickly spread across the world. It is not a certainty, of course, that good ideas spread and bad ideas die. Not at all. But ideas that improve our lives do tend to survive and ideas that hurt us tend to be thrown away. Over time. It is not a constant progression of progress, but still, we do see, in the long view, an increase in knowledge, in understanding of the universe, and in technology that lets us control and manipulate the world around us. We control our environment; we are not controlled by it.

Imagine yourself, if you can, standing on this curve of exponential progress, looking back in time. Now turn around. A tiny step in time looking back contained so much more progress than a giant step just a little further back in time. Imagine what a tiny step into the future would mean.

19 Aug

Strong selective sweeps on the X chromosome of the human-chimpanzee ancestor

We published a new paper a few days ago, titled Strong Selective Sweeps on the X Chromosome in the Human-Chimpanzee Ancestor Explain Its Low Divergence.


The human and chimpanzee X chromosomes are less divergent than expected based on autosomal divergence. We study incomplete lineage sorting patterns between humans, chimpanzees and gorillas to show that this low divergence can be entirely explained by megabase-sized regions comprising one-third of the X chromosome, where polymorphism in the human-chimpanzee ancestral species was severely reduced. We show that background selection can explain at most 10% of this reduction of diversity in the ancestor. Instead, we show that several strong selective sweeps in the ancestral species can explain it. We also report evidence of population specific sweeps in extant humans that overlap the regions of low diversity in the ancestral species. These regions further correspond to chromosomal sections shown to be devoid of Neanderthal introgression into modern humans. This suggests that the same X-linked regions that undergo selective sweeps are among the first to form reproductive barriers between diverging species. We hypothesize that meiotic drive is the underlying mechanism causing these two observations.

We’ve been working on the differences between the autosomes and the X chromosome for a little while now, but this work actually started earlier than this, while we were working on the gorilla genome analysis.

There was an interesting observation made by Patterson et al. (2006) that the divergence between humans and chimpanzees is much lower on the X chromosome than on the autosomes, even taking into account the smaller effective population size on the X chromosome in the common ancestor. They interpreted this, combined with seeing a large variation in coalescence times on the autosomes, as suggestive of strong population structure in the ancestral species, in particular as suggestive that the human ancestor was admixed between a lineage closely related to the chimpanzee ancestor and another lineage that diverged earlier from the chimpanzee.

Most of the patterns they observed could also be explained by just having a very large effective population size in the ancestral species, but the difference in divergence between the autosomes and the X chromosome was still a puzzle.

The X divergence is simply too small compared to the expectation from an effective population size on the X chromosome that is 3/4 of that on the autosomes. With the same number of males and females, there are only 3/4 as many X chromosomes as autosomes in a population. Different breeding success in males and females is expected to shift the ratio away from 3/4 somewhat, but the divergence is really too small for that to be the explanation, especially considering that we don’t see differences that large elsewhere in great apes.
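The 3/4 expectation, and how far unequal breeding success can move it, can be sketched with the textbook formulas for effective population size with a given number of breeding males and females. This is only an illustration under an idealised model, not the analysis from the paper; the function names are made up for the example.

```python
def autosomal_ne(n_males, n_females):
    """Textbook effective population size for autosomes."""
    return 4.0 * n_males * n_females / (n_males + n_females)


def x_linked_ne(n_males, n_females):
    """Textbook effective population size for the X chromosome."""
    return 9.0 * n_males * n_females / (4.0 * n_males + 2.0 * n_females)


def x_to_autosome_ratio(n_males, n_females):
    """Ratio of X-linked to autosomal effective population size."""
    return x_linked_ne(n_males, n_females) / autosomal_ne(n_males, n_females)


if __name__ == "__main__":
    # Hypothetical numbers of breeding males and females.
    for n_males, n_females in [(1000, 1000), (500, 1500), (100, 1900)]:
        ratio = x_to_autosome_ratio(n_males, n_females)
        print(n_males, n_females, round(ratio, 3))
```

With equal numbers of breeding males and females the ratio is exactly 3/4, and in this model it always stays between 9/16 and 9/8 however skewed the breeding sex ratio is, so breeding success alone can only move the expectation so far.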

While working on the gorilla genome paper we used our coalescent hidden Markov model approach to examine this. The CoalHMM lets us look at incomplete lineage sorting along the genome which is a strong proxy for the diversity in the ancestral species. And while we clearly saw the reduced divergence on the X chromosome between humans and chimpanzees we also clearly saw a reduction in ILS — something also observed in the Patterson et al. paper — suggesting that the explanation is not a lower mutation rate but reduced diversity in the ancestral species.

Interestingly, we observed that this reduction in ILS was not uniform along the X chromosome. This is what we have studied in greater detail in the new paper.

What we see is that the ancestral diversity on the X chromosome looks bimodal. On some parts of the X chromosome we see the 3/4 diversity we would expect and in other parts of the chromosome it is much reduced compared to that.

Selection is expected to reduce diversity. Purifying selection removes diversity because deleterious mutations are purged from the population and positive selection creates a sweep where part of the genome has coalesced very recently.

Very wide regions with low diversity are hard to create by purifying selection. It depends a little on the math and the assumptions that go into the models, but generally we don’t expect it based on the models we have, and we don’t see it when we try to simulate the effect of such selection.

Positive selection, in the form of selective sweeps, can create it, but regions as wide as those we observe require either very strong sweeps or recurrent sweeps.

Whatever creates it, it is clearly there, and that is a bit of a puzzle.

We couldn’t really come up with any explanation for the pattern we see except for strong selection. That doesn’t mean that it is selection, of course, but we are completely blank on what it could be if it isn’t.

We don’t really have an explanation for it in the paper. We don’t really have an explanation, period, actually. We do provide a hypothesis for something it could be, but we don’t have any strong evidence for it, so it is something we are still working on figuring out.

Our results do not rule out the complex speciation scenario suggested by Patterson et al. Hybrid incompatibility could be the selection that we are observing. Very interestingly, the regions in which we see the reduced ancestral divergence are exactly the regions where we see no Neanderthal introgression in non-African modern humans, which suggests it could be the same mechanism we see in the human-chimpanzee ancestor.

However, we also see reduced diversity in extant great apes in the same regions so unless there is hybridisation and barriers against gene flow in these regions everywhere in the great ape phylogeny, there is something more complicated — and more interesting — going on.

We are now trying to figure out what that could be.

24 Jun

PubChase and Papers

I wanted to try out PubChase since it sounds like a great way of getting updates on papers I would want to read. I have uploaded my Mendeley library there, but it will take a few days before it can give me recommendations, apparently. I look forward to seeing how it works.

I have been pretty happy with how ReadCube performs at finding recommendations, but while I love the Enhanced PDF feature in ReadCube I still prefer to use Papers for managing my papers and citations (mainly because I’ve had a lot of trouble using ReadCube for citations in a paper). Papers doesn’t do recommendations, though, and doesn’t have features that let you track who has cited a paper and things like that. It’s been a long while since I used Mendeley.

I have set up a workflow for automatically importing papers I download in ReadCube into Papers. Is there any easy way of automatically sending papers from PubChase to Papers?

22 Jun

Admixture graph R package

The last couple of months I have worked on and off on an R package for modelling and testing admixture graphs.

You can download it from github or install it directly in R using:

I know, the github repository has an underscore and the package name does not. R packages can’t have an underscore in their name and I didn’t think about it when I made the repository, so that is how it is right now.

Building admixture graphs

I’m using the package in a couple of projects right now where I’m using it to fit graphs to data. The data I work with is D statistics — I don’t compute those in the package but use ADMIXTOOLS — and I use the package to extract equations for the expected values of these statistics and for fitting graph parameters (edge lengths and admixture proportions) to the data.

It is similar to the qpGraph tool from ADMIXTOOLS (that I have never managed to run) except that I don’t compute error bars on parameters yet. I still have to find a good way of doing that. I have some ideas, but it is a bit more complicated than you might think.

Anyway, the code for specifying graphs is a bit crude but pretty straightforward. The code below builds and plots a graph.

Figure: Admixture graph

Fitting graphs

With a data frame with columns W, X, Y, Z, and D (the first four should be samples and D is then the D(W,X;Y,Z) statistics) you can then fit a graph to the data.
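For readers unfamiliar with the statistic, here is a minimal sketch of one common definition of D(W,X;Y,Z), computed from per-SNP allele frequencies in the four samples. It is only meant to show what the D column contains. It is not code from the package, it is written in Python rather than R, the function name is made up, and ADMIXTOOLS does considerably more (for example estimating standard errors).

```python
def d_statistic(w, x, y, z):
    """D(W,X;Y,Z) from per-SNP allele frequencies (lists of equal length).

    Numerator and denominator are summed over SNPs before dividing.
    """
    if not (len(w) == len(x) == len(y) == len(z)):
        raise ValueError("all frequency lists must have the same length")
    numerator = 0.0
    denominator = 0.0
    for wi, xi, yi, zi in zip(w, x, y, z):
        numerator += (wi - xi) * (yi - zi)
        denominator += (wi + xi - 2.0 * wi * xi) * (yi + zi - 2.0 * yi * zi)
    if denominator == 0.0:
        raise ValueError("no informative SNPs")
    return numerator / denominator


if __name__ == "__main__":
    # Toy frequencies, made up for illustration.
    w = [1.0, 0.0]
    x = [0.0, 1.0]
    y = [1.0, 0.0]
    z = [0.0, 1.0]
    print(d_statistic(w, x, y, z))
```

Swapping Y and Z (or W and X) flips the sign of the statistic, which is why the order of the four sample columns matters when fitting.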

The interface works with magrittr or dplyr pipelines so you can write something like

to fit the graph parameters and plot the fit.


Figure: Fitted data

You can also extract the fitted parameters from the result of fit_graph() using the coefficients() function, get the fitted values using the fitted() function, and in general use the usual interface for fitted models in R.

Except for confidence intervals with confint(). As I wrote above, I haven’t quite figured out how to do that yet.

It is not terribly solid code yet — it is more likely to crash with a meaningless error message than a meaningful one — but I am working on improving that. If anyone can find a use for it, and give some feedback, I would much appreciate it.

22 Jun

An early modern human from Romania with a recent Neanderthal ancestor

New interesting paper out: An early modern human from Romania with a recent Neanderthal ancestor, by Fu et al.

Neanderthals are thought to have disappeared in Europe approximately 39,000–41,000 years ago but they have contributed 1–3% of the DNA of present-day people in Eurasia. Here we analyse DNA from a 37,000–42,000-year-old modern human from Pestera cu Oase, Romania. Although the specimen contains small amounts of human DNA, we use an enrichment strategy to isolate sites that are informative about its relationship to Neanderthals and present-day humans. We find that on the order of 6–9% of the genome of the Oase individual is derived from Neanderthals, more than any other modern human sequenced to date. Three chromosomal segments of Neanderthal ancestry are over 50 centimorgans in size, indicating that this individual had a Neanderthal ancestor as recently as four to six generations back. However, the Oase individual does not share more alleles with later Europeans than with East Asians, suggesting that the Oase population did not contribute substantially to later humans in Europe.
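The step in the abstract from segment length to number of generations can be sketched with a rough back-of-the-envelope model. It assumes that after g generations the lengths of admixture segments are approximately exponentially distributed with rate g per Morgan, and it ignores chromosome ends and the admixture proportion. This is not the method used by Fu et al., and the function names are made up for the example.

```python
import math


def expected_segment_cm(generations):
    """Expected admixture segment length in centimorgans."""
    return 100.0 / generations


def prob_segment_longer_than(length_cm, generations):
    """Probability that a single segment is longer than length_cm."""
    return math.exp(-generations * length_cm / 100.0)


if __name__ == "__main__":
    for g in [5, 50, 500]:
        print(g, expected_segment_cm(g), prob_segment_longer_than(50.0, g))
```

Under this model a segment longer than 50 centimorgans is unremarkable a handful of generations after admixture but vanishingly unlikely after hundreds of generations, which is the intuition behind reading long segments as a sign of a very recent Neanderthal ancestor.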

I actually heard about this back in May in Cold Spring Harbor, but it is great to be able to actually read the paper.

What especially excites me about this paper is that it hints that admixture between Neanderthals and modern humans was pretty common. For some values of “pretty” and “common” at least. What we know so far about Neanderthal admixture is that all non-Africans have a little, but that Asians (and Native Americans) have a little more than Europeans, which is somewhat surprising, and that this most likely is because of a second admixture into Asians.

There was a suggestion that the higher level of Neanderthal ancestry in Asians was caused by less negative selection removing it from Asians, but a couple of papers argue against that (see e.g. Vernot & Akey 2015).

So there was the admixture event ancestral to all Eurasians, another into Asians, and now a third admixture event.

This individual (and, I guess, this admixture event) didn’t leave many (if any) genes in extant populations, so it isn’t part of the admixture that we see the results of today, but it was an admixture event between modern humans and Neanderthals nevertheless.

There is a rule of thumb that says: “zero, one, or many” — either something never happens, it is rare enough that it only happens once, or it happens a lot. It is not a strong rule, but now that we have evidence for at least three admixture events it seems hard to argue that interbreeding between modern humans and Neanderthals was rare.

The strange thing now is how uniform the level of Neanderthals is in Eurasians after all. I think I would expect some geographic differences, but maybe such differences have just been evened out by later migration. I don’t know. I don’t really have a strong intuition for that.

Vernot, B., & Akey, J. M. (2015). Complex History of Admixture between Modern Humans and Neandertals. American Journal of Human Genetics.