
What I wanted to blog about yesterday, but didn't get around to as I explained in the previous post, was two letters in the latest issue of Nature on human variation and the distribution of deleterious mutations. I'll split it into two posts; in this post I'll discuss Lohmueller et al. Genetic Future beat me to it, so I suggest you also read the discussion there. The paper is also covered in the latest Nature Podcast and commented on at Nature. For a human evolution perspective, read John Hawks' post on the topic.
Proportionally more deleterious genetic variation in European than in African populations
Lohmueller et al.
Abstract
Quantifying the number of deleterious mutations per diploid human genome is of crucial concern to both evolutionary and medical geneticists. Here we combine genome-wide polymorphism data from PCR-based exon resequencing, comparative genomic data across mammalian species, and protein structure predictions to estimate the number of functionally consequential single-nucleotide polymorphisms (SNPs) carried by each of 15 African American (AA) and 20 European American (EA) individuals. We find that AAs show significantly higher levels of nucleotide heterozygosity than do EAs for all categories of functional SNPs considered, including synonymous, non-synonymous, predicted 'benign', predicted 'possibly damaging' and predicted 'probably damaging' SNPs. This result is wholly consistent with previous work showing higher overall levels of nucleotide variation in African populations than in Europeans. EA individuals, in contrast, have significantly more genotypes homozygous for the derived allele at synonymous and non-synonymous SNPs and for the damaging allele at 'probably damaging' SNPs than AAs do. For SNPs segregating only in one population or the other, the proportion of non-synonymous SNPs is significantly higher in the EA sample (55.4%) than in the AA sample (47.0%; P < 2.3 x 10^-37). We observe a similar proportional excess of SNPs that are inferred to be 'probably damaging' (15.9% in EA; 12.1% in AA; P < 3.3 x 10^-11). Using extensive simulations, we show that this excess proportion of segregating damaging alleles in Europeans is probably a consequence of a bottleneck that Europeans experienced at about the time of the migration out of Africa.
In this paper, the authors compare the genetic variability in Americans of African descent and of European descent, classify the variants according to their predicted effect on fitness, and examine how the "fitness" of the variants differs between the two populations.
Classifying variations and comparing the populations
Using genome-wide exon re-sequencing, the authors identified SNP variation in the sample and compared with the chimpanzee genome to infer ancestral and derived alleles. Ignoring for a bit the effect of mutations, just from knowing the variations and which alleles are ancestral and derived, we can learn about the history of the populations.
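To make the ancestral/derived idea concrete, here is a minimal sketch of how a SNP can be polarized against an outgroup. The function name `polarize` and the input format are my own assumptions for illustration, not the authors' pipeline; the only idea taken from the paper is that the allele shared with chimpanzee is treated as ancestral.

```python
def polarize(human_alleles, chimp_allele):
    """Hypothetical helper: polarize a biallelic SNP using an outgroup.

    human_alleles: tuple of the two alleles seen in the human sample.
    chimp_allele:  the base at the aligned position in the chimpanzee genome.
    Returns (ancestral, derived), or None when the chimp base matches
    neither human allele and the site cannot be polarized this simply.
    """
    first, second = human_alleles
    if chimp_allele == first:
        return first, second
    if chimp_allele == second:
        return second, first
    return None


# Humans carry A and G at a site, chimpanzee carries G:
# G is inferred to be ancestral and A derived.
result = polarize(("A", "G"), "G")
```

This ignores the possibility that the chimpanzee lineage itself has mutated at the site, which a real analysis would have to account for.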
First off, we can consider the variation within the populations. Are there more variable sites in one population than in the other? Is there more heterozygosity (meaning, are people more likely to carry two different alleles at a site) in one population or the other?
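Both questions reduce to simple counting once the genotypes are in hand. The sketch below assumes a made-up encoding in which each individual is a list with one entry per site, giving the number of derived alleles carried there (0, 1 or 2); the function names and the toy data are mine, not the paper's.

```python
def segregating_sites(population):
    """Count sites where both alleles are present in the sample."""
    n_individuals = len(population)
    n_sites = len(population[0])
    count = 0
    for site in range(n_sites):
        derived_copies = sum(ind[site] for ind in population)
        # Variable means neither allele accounts for all 2n copies.
        if 0 < derived_copies < 2 * n_individuals:
            count += 1
    return count


def mean_heterozygous_sites(population):
    """Average number of sites at which an individual carries two different alleles."""
    per_individual = [sum(1 for g in ind if g == 1) for ind in population]
    return sum(per_individual) / len(population)


# Toy sample: three individuals, four sites.
toy = [
    [0, 1, 2, 0],
    [0, 1, 2, 1],
    [0, 0, 2, 0],
]
n_variable = segregating_sites(toy)
mean_het = mean_heterozygous_sites(toy)
```

In the toy sample the first site is fixed for the ancestral allele and the third for the derived allele, so only two sites count as variable.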
The results in the paper confirm previous studies showing that there is more variability in individuals of African descent than in those of European descent, matching the Out of Africa hypothesis. If humans originated in Africa -- which everything indicates and I doubt anyone disagrees with any more -- and populations outside Africa are relatively recent, then we expect the variability in Africa to be greater than outside Africa. A small population branching off from a larger one will only carry some of the variants with it, and it takes time for this to level out.
The SNPs can be classified into two categories: synonymous SNPs -- those that do not change the amino acid the codon codes for -- and non-synonymous SNPs -- those that do. Roughly speaking, we expect the non-synonymous mutations to have an effect on fitness but not the synonymous ones. This is very rough, however, since synonymous mutations can have major effects on regulation, splicing, etc., but still...
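The synonymous/non-synonymous split can be sketched in a few lines using the standard genetic code. The helper `classify_snp` and its arguments are hypothetical names of mine; a real annotation pipeline also has to deal with reading frames, strands and overlapping transcripts, which this ignores.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(codon, position, new_base):
    """Classify a single-base change within a codon.

    codon:    the reference codon, e.g. "GAG"
    position: 0, 1 or 2, the position within the codon that changes
    new_base: the alternative base at that position
    """
    mutated = codon[:position] + new_base + codon[position + 1:]
    if CODON_TABLE[codon] == CODON_TABLE[mutated]:
        return "synonymous"
    return "non-synonymous"


# GAA and GAG both code for glutamate, so a third-position A -> G change is silent.
silent = classify_snp("GAA", 2, "G")
# GAG -> GTG replaces glutamate with valine.
changing = classify_snp("GAG", 1, "T")
```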
Using bioinformatics methods, the authors classify non-synonymous mutations into deleterious and non-deleterious mutations based on protein structure and conservation. They then observe that the deleterious mutations make up a proportionally larger share of the variation in individuals of European descent.
Why is this an expected result?
To understand why this is the case, we turn to population genetics.
We expect deleterious mutations to be removed -- or at least kept down in frequency -- by selection, but there is a certain stochasticity in this. The frequency of an allele varies somewhat randomly in a population. Offspring will inherit one allele or the other with equal probability and pass that allele on to their offspring with equal probability. With no selection acting on the allele, the frequency will shrink or grow randomly until the allele is either fixed in the population or lost completely. When selection is acting on the allele, the number of offspring will depend on the alleles an individual carries. There is still randomness, but the distribution of the number of offspring will change, more or less, depending on the strength of the selection.
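The process described above is essentially the Wright-Fisher model, and a toy version is easy to simulate. This is a minimal sketch under simplifying assumptions of my own: a haploid population with a fixed number of gene copies, relative fitness 1 + s for the allele being tracked (negative s for a deleterious allele), and function and parameter names invented for illustration.

```python
import random


def wright_fisher(n_copies, start_count, s=0.0, rng=None, max_generations=100000):
    """Follow one allele's frequency until it is fixed or lost.

    n_copies:    number of gene copies in the population (held constant)
    start_count: copies of the tracked allele in the first generation
    s:           selection coefficient; the tracked allele has fitness 1 + s
    rng:         a random.Random instance, so runs can be made repeatable
    """
    rng = rng or random.Random()
    count = start_count
    trajectory = [count / n_copies]
    while 0 < count < n_copies and len(trajectory) <= max_generations:
        p = count / n_copies
        # Selection shifts the expected frequency among the parents ...
        p_selected = p * (1 + s) / (p * (1 + s) + (1 - p))
        # ... and drift is the binomial sampling of the next generation.
        count = sum(rng.random() < p_selected for _ in range(n_copies))
        trajectory.append(count / n_copies)
    return trajectory


neutral = wright_fisher(50, 25, s=0.0, rng=random.Random(1))
deleterious = wright_fisher(50, 25, s=-0.05, rng=random.Random(1))
```

Each run ends with the allele either lost (frequency 0) or fixed (frequency 1). Repeating the run many times with different seeds shows how often each outcome occurs for a given population size and selection coefficient.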
How does this explain that there are more deleterious mutations in Europeans, then? This has to do with how stochastic the process really is.
Generally in stochastic processes, when we consider small numbers the relative variation in the process is larger than when we consider larger numbers. For very large numbers, a stochastic process can behave almost deterministically, while for very small numbers the process can appear completely random.
A consequence of this is that weak selection requires a large population to have any observable effect over the background randomness of the process. The weaker the selection, the larger the population needs to be for the selection to have any effect.
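To put numbers on this, I'll swap in the Moran model, a close relative of the process described above, because it has a simple exact formula for the probability that a single new mutant with relative fitness r = 1 + s eventually takes over a population of size N: (1 - 1/r) / (1 - 1/r^N). The sketch below compares a slightly deleterious mutant with a neutral one, whose fixation probability is 1/N. The function name and the chosen values of N and s are illustrative assumptions, not taken from the paper.

```python
def moran_fixation_probability(n, s):
    """Fixation probability of one new mutant with fitness 1 + s in a Moran population of size n."""
    if s == 0:
        return 1.0 / n  # neutral case: every copy is equally likely to be the eventual ancestor
    r = 1.0 + s
    # Note: r ** (-n) can overflow for very large n with negative s; fine for these sizes.
    return (1 - 1 / r) / (1 - r ** (-n))


s = -0.01  # a mildly deleterious mutation

# Fixation probability relative to a neutral mutation in the same population.
relative_small = moran_fixation_probability(10, s) / moran_fixation_probability(10, 0)
relative_large = moran_fixation_probability(10000, s) / moran_fixation_probability(10000, 0)
```

With ten individuals the deleterious mutant fixes almost as often as a neutral one, so selection is barely visible against drift. With ten thousand individuals its chance relative to a neutral mutant is vanishingly small.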
If a population goes through a bottleneck, as the non-African populations are thought to have done, the selection that would act on the African population would have little effect on the non-African populations. Weakly deleterious mutations that are selected against in the African population will not have been selected against as effectively in the non-African populations, simply because the selection wasn't strong enough to have any effect in the smaller populations.
The paper finishes with a simulation study showing that a bottleneck at the migration out of Africa, followed by a population expansion, gives the observed pattern of variation, nicely supporting this explanation.
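The authors' simulations are far more elaborate than anything I can show here, but the basic ingredient, drift and selection acting while the population size changes, can be sketched as follows. Everything in this block is a toy assumption of mine: the haploid model, the function name, the size schedule and the selection coefficient are arbitrary and are not the demographic parameters used in the paper.

```python
import random


def simulate_with_bottleneck(sizes, start_freq, s, rng):
    """Track an allele's frequency while the population size changes.

    sizes:      number of gene copies in each successive generation
    start_freq: frequency of the tracked allele before the first generation
    s:          selection coefficient (negative for a deleterious allele)
    rng:        a random.Random instance, for repeatable runs
    """
    freq = start_freq
    trajectory = [freq]
    for n_copies in sizes:
        # Expected frequency after selection, then binomial sampling (drift).
        p_selected = freq * (1 + s) / (freq * (1 + s) + (1 - freq))
        count = sum(rng.random() < p_selected for _ in range(n_copies))
        freq = count / n_copies
        trajectory.append(freq)
    return trajectory


# Toy schedule: large population, a sharp bottleneck, then recovery.
schedule = [2000] * 20 + [50] * 20 + [2000] * 20
trajectory = simulate_with_bottleneck(schedule, 0.05, -0.01, random.Random(42))
```

Running this for many independent alleles, with and without the bottleneck in the schedule, and comparing the frequencies at the end gives a feel for how the bottleneck changes what selection can accomplish.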
Lohmueller, K.E., Indap, A.R., Schmidt, S., Boyko, A.R., Hernandez, R.D., Hubisz, M.J., Sninsky, J.J., White, T.J., Sunyaev, S.R., Nielsen, R., Clark, A.G., Bustamante, C.D. (2008). Proportionally more deleterious genetic variation in European than in African populations. Nature, 451(7181), 994-997. DOI: 10.1038/nature06611