Imputation based association mapping
Friday, February 1st, 2008
A neat idea that has become quite popular over the last two years is to treat untyped genotypes as “missing data” and then impute them using panel data such as HapMap. Imputing all missing markers and then testing them is the ultimate multi-marker association mapping method: you directly test all markers, and if you get a hit you immediately know which marker to try to replicate.
Well, I might be overselling it a bit here — there are some situations where imputing markers and testing them individually won’t actually help you and where other multi-marker methods will — but it is a very nice idea and the output is very easy to interpret.
Unfortunately, imputation methods can be pretty slow. We’ve used fastPHASE in our projects, and while it works fine for smaller regions, it is too computationally intensive for whole-genome imputation (at least with the computers we have access to).
In this issue of Bioinformatics there’s an application note describing a new tool for doing imputation based association mapping:
Association studies for untyped markers with TUNA
Xiaoquan Wen and Dan L. Nicolae
Bioinformatics 2008 24(3):435-437; doi:10.1093/bioinformatics/btm603
Rather than imputing the actual genotypes at the missing markers, they impute the frequencies of those markers in the cases and in the controls, which significantly improves both the running time and the memory usage.
Getting only the frequencies will not help us use multi-marker methods on imputed data, but for single-marker tests (at least tests that only use the frequencies) I imagine it could be a very useful tool.
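To make "tests that only use the frequencies" concrete, here is a minimal sketch of a plain allelic chi-square test computed from an allele frequency in cases and one in controls. This is not TUNA's test statistic, which the post does not describe. The function name and parameters are hypothetical, and the sketch treats the frequencies as if they were observed counts, so with imputed frequencies it ignores the imputation uncertainty and would likely be too optimistic.

```python
import math


def allelic_chi_square(freq_cases, freq_controls, n_cases, n_controls):
    """Allelic 2x2 chi-square test (1 df) from allele frequencies.

    Hypothetical helper for illustration only. Assumes diploid
    individuals, so each group contributes 2 * n chromosomes, and
    treats the frequencies as exact (no imputation uncertainty).
    Returns (chi_square_statistic, p_value).
    """
    chrom_cases = 2 * n_cases
    chrom_controls = 2 * n_controls

    # Expected allele counts implied by the frequencies.
    a = freq_cases * chrom_cases          # test allele, cases
    b = chrom_cases - a                   # other allele, cases
    c = freq_controls * chrom_controls    # test allele, controls
    d = chrom_controls - c                # other allele, controls

    total = chrom_cases + chrom_controls
    allele_total = a + c
    other_total = b + d

    # A monomorphic marker carries no information about association.
    if allele_total == 0 or other_total == 0:
        return 0.0, 1.0

    chi2 = (total * (a * d - b * c) ** 2
            / (chrom_cases * chrom_controls * allele_total * other_total))

    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    chi2, p = allelic_chi_square(0.30, 0.20, n_cases=1000, n_controls=1000)
    print(f"chi2 = {chi2:.2f}, p = {p:.3g}")
```

The point of the sketch is only that nothing beyond the two frequencies and the sample sizes is needed, which is why frequency-only imputation is enough for this kind of single-marker test but not for methods that need individual genotypes or haplotypes.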
Citation for Research Blogging:
Wen, X., Nicolae, D.L. (2008). Association studies for untyped markers with TUNA. Bioinformatics, 24(3), 435-437. DOI: 10.1093/bioinformatics/btm603