How much selection is going on in humans?
A priori we expect that most mutations, by far, have no consequence on fitness, while some have a negative effect and very few have a positive effect. Consequently, we can generally ignore selection when analysing genomic sequences.
However, over the last few years a number of papers have suggested that adaptive (positive) selection has played a major role in shaping the human genome. That is, there are genome-wide signals that show patterns of selection. So perhaps we shouldn't be so quick to ignore it.
Yesterday another paper arguing this appeared in PLoS Genetics:
Cai et al. PLoS Genetics 5(1)
Much effort and interest have focused on assessing the importance of natural selection, particularly positive natural selection, in shaping the human genome. Although scans for positive selection have identified candidate loci that may be associated with positive selection in humans, such scans do not indicate whether adaptation is frequent in general in humans. Studies based on the reasoning of the MacDonald–Kreitman test, which, in principle, can be used to evaluate the extent of positive selection, suggested that adaptation is detectable in the human genome but that it is less common than in Drosophila or Escherichia coli. Both positive and purifying natural selection at functional sites should affect levels and patterns of polymorphism at linked nonfunctional sites. Here, we search for these effects by analyzing patterns of neutral polymorphism in humans in relation to the rates of recombination, functional density, and functional divergence with chimpanzees. We find that the levels of neutral polymorphism are lower in the regions of lower recombination and in the regions of higher functional density or divergence. These correlations persist after controlling for the variation in GC content, density of simple repeats, selective constraint, mutation rate, and depth of sequencing coverage. We argue that these results are most plausibly explained by the effects of natural selection at functional sites—either recurrent selective sweeps or background selection—on the levels of linked neutral polymorphism. Natural selection at both coding and regulatory sites appears to affect linked neutral polymorphism, reducing neutral polymorphism by 6% genome-wide and by 11% in the gene-rich half of the human genome. These findings suggest that the effects of natural selection at linked sites cannot be ignored in the study of neutral human polymorphism.
Selection and variation
Neutral mutations are expected to behave differently from non-neutral mutations mainly in their chance of getting fixed and in the time it takes them to get fixed in a population. Neutral mutations that get fixed are expected to have taken a number of generations linear in the effective population size, while mutations under selection that get fixed are expected to have taken a number of generations logarithmic in it. That goes for both positive and negative selection, but for different reasons.
Consider a region where we assume there is no recombination going on, and assume that a new mutation appears there that is destined to get fixed in the population. When it gets fixed, all individuals in the population will be descended from the individual that first carried the mutation. They will not be identical in the region, though, because new mutations will have accumulated in the time it took the mutation to get fixed.
The amount of variation in the population will depend on how quickly the mutation got fixed. If it happened very slowly, we expect much variation, and if it happened very rapidly, we expect little variation.
That, combined with the expected time to fixation for neutral and selected mutants, gives us a pattern to look for that distinguishes between neutral evolution and selection.
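To get a feel for "linear" versus "logarithmic", here is a minimal Python sketch using the standard diffusion approximations: roughly 4*N_e generations for a neutral mutation, and roughly 2*ln(2*N_e)/|s| generations for a mutation with selection coefficient s, both conditional on fixation. The function names and the example values of N_e and s are my own choices for illustration, and the selected-case formula is only reasonable when 2*N_e*|s| is much larger than 1.

```python
import math


def neutral_fixation_time(n_e):
    """Approximate generations to fixation for a new neutral mutation,
    conditional on fixation, in a diploid population: about 4 * N_e."""
    return 4.0 * n_e


def selected_fixation_time(n_e, s):
    """Approximate generations to fixation for a mutation under selection,
    conditional on fixation: about 2 * ln(2 * N_e) / |s|.

    Only |s| enters, so the approximation is the same for positive and
    negative selection. Assumes 2 * N_e * |s| >> 1.
    """
    if s == 0:
        raise ValueError("s must be non-zero; use neutral_fixation_time instead")
    return 2.0 * math.log(2.0 * n_e) / abs(s)


# Illustrative values only: three population sizes and s = 0.01.
for n_e in (1_000, 10_000, 100_000):
    print(n_e,
          round(neutral_fixation_time(n_e)),
          round(selected_fixation_time(n_e, 0.01)))
```

Increasing the population size a hundredfold increases the neutral time a hundredfold, but the selected time grows by less than a factor of two.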
Variation, selection and recombination
When recombination is going on, we expect a slightly different pattern.
If there is selection on several mutations in the region, it gets pretty complicated. At least I haven't managed to quite get my head around the details yet, but I'll refer you to this book: Population Genetics of Multiple Loci by Freddy Christiansen.
I will just assume that there is a single mutation under selection.
In that case, the pattern really is very similar. We don't expect to see reduced variation in the entire region around the mutation, but instead we expect reduced variation close to the mutation site -- where few recombination events have occurred while the mutation got fixed -- and increased variation up to the neutral level as we move away from the mutation site -- where more and more recombinations have uncoupled the mutant from sites further away.
Selection in humans
It is this kind of pattern they look for in the PLoS Genetics paper.
A consequence of all of the above is that, assuming lots of selection is going on, we expect a positive correlation between variation and recombination rate, and a negative correlation between variation and the density of sites we a priori expect to be functional (like genes).
This is exactly what they find.
There are a few more details to it, of course. The density of functional sites, the recombination rate and the mutation rate are not independent, so exactly the same pattern could arise just from a correlation with the neutral mutation rate. They need to correct for this.
Essentially, though, it is these patterns -- expected assuming selection but not assuming neutrality -- that they find.
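To illustrate why such a correction matters, here is a small Python sketch of one simple way to control for a confounder, a partial correlation. I am not claiming this is the procedure used in the paper. The data are synthetic and made up purely for illustration: both "recombination" and "diversity" are generated from a shared "mutation rate" plus independent noise, so any raw correlation between them is due to the confounder alone.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Synthetic data (NOT real measurements): 2000 hypothetical genomic windows.
rng = random.Random(1)
mutation_rate = [rng.gauss(0, 1) for _ in range(2000)]
recombination = [m + rng.gauss(0, 0.5) for m in mutation_rate]
diversity = [m + rng.gauss(0, 0.5) for m in mutation_rate]

raw = pearson(recombination, diversity)
corrected = partial_correlation(recombination, diversity, mutation_rate)
print("raw correlation:      ", raw)
print("controlling for rate: ", corrected)
```

In this toy setup the raw correlation is strong while the partial correlation is close to zero. The interesting claim in the paper is the opposite outcome: the correlations persist after controlling for the covariates.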
James J. Cai, J. Michael Macpherson, Guy Sella, Dmitri A. Petrov (2009). Pervasive Hitchhiking at Coding and Regulatory Sites in Humans PLoS Genetics, 5 (1) DOI: 10.1371/journal.pgen.1000336