Statistical power and interacting genes
Sunday, March 23rd, 2008
Earlier this week we discussed the paper below in our association mapping journal club. Lately we have been interested in epistasis (gene-gene interaction) in the context of association mapping -- we have just submitted a paper on the subject and have a few projects in the pipeline working on this problem -- and one problem that concerns us is the power of detecting gene-gene interaction in association mapping. This paper turned out not to really be about that, but it was interesting nonetheless.
Anyway, back to the paper:
Power of genome-wide association studies in the presence of interacting loci
Joseph Pickrell, Françoise Clerget-Darpoux, Catherine Bourgain
Genetic Epidemiology 31(7) 748 - 762
Abstract
Though multiple interacting loci are likely involved in the etiology of complex diseases, early genome-wide association studies (GWAS) have depended on the detection of the marginal effects of each locus. Here, we evaluate the power of GWAS in the presence of two linked and potentially associated causal loci for several models of interaction between them and find that interacting loci may give rise to marginal relative risks that are not generally considered in a one-locus model. To derive power under realistic situations, we use empirical data generated by the HapMap ENCODE project for both allele frequencies and linkage disequilibrium (LD) structure. The power is also evaluated in situations where the causal single nucleotide polymorphisms (SNPs) may not be genotyped, but rather detected by proxy using a SNP in LD. A common simplification for such power computations assumes that the sample size necessary to detect the effect at the tSNP is the sample size necessary to detect the causal locus directly divided by the LD measure r2 between the two. This assumption, which we call the proportionality assumption, is a simplification of the many factors that contribute to the strength of association at a marker, and has recently been criticized as unreasonable (Terwilliger and Hiekkalinna [2006] Eur J Hum Genet 14(4):426-437), in particular in the presence of interacting and associated loci. We find that this assumption does not introduce much error in single locus models of disease, but may do so in certain two-locus models.
The problem considered in the paper is the following: If we are searching for gene-disease association and the disease risk depends on an interaction between two variants, will we be able to detect it? I'm simplifying a bit here, but that is the essential question.
Testing single markers
The typical approach for finding genes that affect the disease risk, at least when analysing the entire genome, is to go through each typed variant and test if the cases and controls have different distributions of genotype frequencies. I've described this in a bit more detail in an earlier post, so I won't say much more on that here.
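As a rough illustration of such a single-marker test, here is a minimal sketch of a Pearson chi-square test on a 2x3 table of genotype counts. The function names and the counts are made up for illustration, and this is just one common choice of test, not necessarily the one used in the paper.

```python
import math


def genotype_chi_square(cases, controls):
    """Pearson chi-square statistic for a 2x3 table of genotype counts.

    cases and controls hold counts for the genotypes (aa, Aa, AA).
    Under the null hypothesis the statistic has 2 degrees of freedom.
    """
    n_cases, n_controls = sum(cases), sum(controls)
    total = n_cases + n_controls
    stat = 0.0
    for case_count, control_count in zip(cases, controls):
        column_total = case_count + control_count
        if column_total == 0:
            continue  # genotype never observed, contributes nothing
        for observed, row_total in ((case_count, n_cases),
                                    (control_count, n_controls)):
            expected = row_total * column_total / total
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_2df(stat):
    """Upper tail of the chi-square distribution with 2 degrees of freedom."""
    return math.exp(-stat / 2.0)


# Hypothetical counts for (aa, Aa, AA), not real data.
stat = genotype_chi_square(cases=(10, 20, 30), controls=(30, 20, 10))
print(stat, p_value_2df(stat))
```

In a genome-wide scan the same test is repeated for every typed marker, so the p-values need a multiple-testing correction before they mean anything.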
The power to detect an association when it is there depends on several parameters, such as the allele frequencies, the sample size, and of course the strength of the effect the genotype has on the disease risk, typically measured by the genetic relative risk, GRR. For a biallelic marker (what we typically consider), we can consider the risk of genotype aa the "basic" risk (GRRaa=1) and talk about the relative risks of Aa and AA, GRRAa and GRRAA. Different "disease models" put constraints on these, e.g. a recessive model would have GRRaa=GRRAa=1 != GRRAA, but in general there are two risks that can vary in relation to the basic risk.
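To make the role of the GRRs concrete, here is a small sketch of how they shift the genotype distribution among cases. It assumes Hardy-Weinberg proportions in the population, which is my assumption and not something stated above, and the function names are made up.

```python
def hwe_genotype_freqs(p):
    """Population frequencies of (aa, Aa, AA) for frequency p of allele A,
    assuming Hardy-Weinberg equilibrium."""
    q = 1.0 - p
    return (q * q, 2 * p * q, p * p)


def case_genotype_freqs(p, grr_het, grr_hom):
    """Genotype frequencies of (aa, Aa, AA) among cases, with GRRaa fixed at 1.

    The probability of a genotype given disease is proportional to the
    population frequency of the genotype times its relative risk.
    """
    population = hwe_genotype_freqs(p)
    weights = [f * r for f, r in zip(population, (1.0, grr_het, grr_hom))]
    total = sum(weights)
    return tuple(w / total for w in weights)


# Hypothetical recessive model: only AA carries an increased risk.
print(hwe_genotype_freqs(0.5))
print(case_genotype_freqs(0.5, grr_het=1.0, grr_hom=2.0))
```

The further the case frequencies move away from the population frequencies, the fewer samples are needed to see the difference, which is how the GRRs and the allele frequencies both enter into the power.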
Gene-gene interaction
Now, if the disease risk depends on several markers, you can have various kinds of interaction. For two markers, you now have nine genotypes, {aa,Aa,AA}x{bb,Bb,BB}, with eight GRRs that can vary in relation to GRRaabb. Again, various "classical" disease models can put constraints on the GRRs.
The problem they consider in the paper is such a pair-wise interaction setup (with four different disease models), and how the power of detecting an association depends on the GRRs, disease model, allele frequencies, etc.
Detecting an association, here, means detecting an association at A or B (or both), but not detecting the right disease model, or detecting that there is really an interaction going on; it is still considered a "hit" if only one of the two markers is found to be associated with the disease. I'll get back to that below.
The way they go about this is to calculate the marginal GRRs, i.e. the relative risks of AA and Aa when ignoring the B marker, and the GRRs of BB and Bb when ignoring the A marker. These marginal GRRs are, of course, affected by the (interaction) disease model, GRRs of the interacting pair, frequencies, etc, but once the marginal GRRs have been calculated, the power of detection can be computed as if no interaction were going on.
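The marginalisation can be sketched in a few lines. This is a simplified version that assumes the two loci are independent in the population (no LD between them) and in Hardy-Weinberg proportions. The paper handles linked and associated loci, so take this as an illustration of the idea rather than their computation. All names and numbers are made up.

```python
def hwe_genotype_freqs(p):
    """Population genotype frequencies (homozygote, heterozygote, homozygote)
    for allele frequency p, assuming Hardy-Weinberg equilibrium."""
    q = 1.0 - p
    return [q * q, 2 * p * q, p * p]


def marginal_grrs(grr, other_freqs):
    """Marginal GRRs for the row locus of a 3x3 table of two-locus GRRs.

    grr[i][j] is the relative risk of genotype i at the row locus combined
    with genotype j at the other locus; other_freqs are the population
    genotype frequencies at the other locus. Each row is averaged over the
    other locus and then rescaled so the first genotype has risk 1.
    """
    risk = [sum(f * r for f, r in zip(other_freqs, row)) for row in grr]
    return [r / risk[0] for r in risk]


def transpose(table):
    """Swap the roles of the two loci."""
    return [list(column) for column in zip(*table)]


# Hypothetical multiplicative model: rows are (aa, Aa, AA), columns (bb, Bb, BB).
effects_a = [1.0, 2.0, 4.0]
effects_b = [1.0, 1.5, 3.0]
grr = [[a * b for b in effects_b] for a in effects_a]

marginal_a = marginal_grrs(grr, hwe_genotype_freqs(0.3))             # ignoring B
marginal_b = marginal_grrs(transpose(grr), hwe_genotype_freqs(0.2))  # ignoring A
print(marginal_a, marginal_b)
```

For a purely multiplicative model like this one the marginal GRRs are just the per-locus effects again. With other interaction models the marginals depend on the frequencies at the other locus, which is where the unusual marginal relative risks mentioned in the abstract come from.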
Indirect testing
Typically, we do not have all the variation typed, but rely on tagSNPs to indirectly test for association. The way this works is that the SNPs are correlated (this correlation is called linkage disequilibrium, LD) so the relative risk of one SNP "leaks into" a relative risk of another SNP. The GRRs of a tagSNP depend on the LD with the causal SNP(s) and on the allele frequencies, and the relationship is not straightforward, but as a rule of thumb the following holds: if a sample size of N is needed to detect association at the causal marker, then a sample size of N / r2 is needed at the tagSNP, where r2 is a measure of LD.
Although mathematically justified, it is only a rule of thumb, and it is violated especially in the presence of interaction (where there is potentially LD between the tagSNP and both causal SNPs, to confuse the matter).
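The N / r2 rule of thumb is easy to write down in code. The sketch below uses the standard definition of r2 from haplotype and allele frequencies. The function names and the numbers in the example are made up.

```python
import math


def r_squared(p_ab, p_a, p_b):
    """LD measure r2 between two biallelic SNPs.

    p_ab is the frequency of the haplotype carrying allele A at the first
    SNP and allele B at the second; p_a and p_b are the allele frequencies.
    """
    d = p_ab - p_a * p_b  # the disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


def tag_sample_size(n_causal, r2):
    """Sample size needed at the tagSNP under the N / r2 rule of thumb."""
    return math.ceil(n_causal / r2)


# Hypothetical numbers: 1000 samples suffice at the causal SNP and the tag
# has r2 = 0.5 with it, so the rule asks for twice as many samples.
print(tag_sample_size(1000, 0.5))
```

Note that this only involves the LD between the tag and a single causal SNP. With two interacting causal SNPs the tag can be in LD with both, and a single r2 value no longer summarises the situation.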
A large part of the paper is concerned with this rule of thumb, and in my opinion this is the most interesting part of the paper. We know very little about how we perform in tagging for interaction, since essentially all tagging algorithms are based on the r2 rule of thumb.
Not really about interaction
Since they define "detection of association" to be detection of a marginal association, we are not really considering the power of detecting interaction. For the direct testing (when we are not considering tagSNPs), the interaction doesn't really come into play at all! The interaction model determines the marginal GRR, and as such it is interesting enough, but once we have the marginal GRR, there is nothing new in how we determine the power. The greater the GRR, the greater the power, but that is completely independent of whether there is interaction or not.
For the tagging consideration it is a different matter. There the interaction has an effect, as I mentioned above, because both causal SNPs can be in LD with the tag, and that affects the r2 rule of thumb.
Still, the paper is about the power of detecting marginal association, not interaction, and it is possible (and not even that hard) to construct models where there is a strong interaction association but very little marginal effect. For such a setup, a marginal test will never be powerful, and a full interaction model must be used.
It is the latter problem we are currently working on in my group. How do we find pairs that interact but have little marginal effect (we have just submitted a paper on that)? What is the power to detect such interaction? And how well do we tag such interaction?
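To see how easy it is to construct a model with interaction but no marginal effect, here is a toy example. The GRR table is invented for illustration, and I assume both loci have allele frequency 0.5, are in Hardy-Weinberg proportions, and are independent of each other in the population.

```python
def marginal_grrs(grr, other_freqs):
    """Marginal GRRs for the row locus, averaging over the other locus and
    rescaling so the first genotype has relative risk 1."""
    risk = [sum(f * r for f, r in zip(other_freqs, row)) for row in grr]
    return [r / risk[0] for r in risk]


# Genotype frequencies at allele frequency 0.5 under Hardy-Weinberg.
freqs = [0.25, 0.5, 0.25]

# Hypothetical two-locus GRRs: rows (aa, Aa, AA), columns (bb, Bb, BB).
# The risk doubles for some genotype combinations, in a checkerboard pattern.
grr = [
    [1.0, 2.0, 1.0],
    [2.0, 1.0, 2.0],
    [1.0, 2.0, 1.0],
]
grr_transposed = [list(column) for column in zip(*grr)]

marginal_a = marginal_grrs(grr, freqs)             # ignoring B
marginal_b = marginal_grrs(grr_transposed, freqs)  # ignoring A
print(marginal_a, marginal_b)
```

Both sets of marginal GRRs come out as 1 for every genotype, so a single-marker test has nothing to find at either locus, even though the joint genotype changes the risk by a factor of two.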
Pickrell, J., Clerget-Darpoux, F., Bourgain, C. (2007). Power of Genome-Wide Association Studies in the Presence of Interacting Loci. Genetic Epidemiology, 31, 748-762.
