Article
The impact of haplotype uncertainty resulting from genotyping error and statistical haplotype reconstruction on association analysis
Published: 6 September 2007
Text
Introduction: Haplotypes are of increasing interest in genetic association studies. Genotyping error in each SNP and statistical haplotype reconstruction both introduce uncertainty into the haplotypes, which can bias association estimates or reduce statistical power.
Material and Methods: To quantify the uncertainty in haplotypes, we conducted simulation studies comparing reconstructed haplotypes with the given haplotypes, which yielded misclassification matrices and error measures such as sensitivity and specificity. A genotyping error of up to 1% per allele was assumed and incorporated into the haplotype misclassification problem. We present the impact of haplotype misclassification on association estimates under a dominant as well as a codominant inheritance model. The MC-SIMEX approach (Misclassification Simulation Extrapolation) was applied as a practical method to correct haplotype association estimates for misclassification.
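The error measures named above can be illustrated with a minimal Python sketch. It is not the authors' simulation code: the function names and the toy haplotype labels ("ACG", "TCA", "TCG") are hypothetical, and it assumes that one true and one reconstructed label is available per chromosome.

```python
from collections import Counter


def misclassification_matrix(true_haps, reconstructed_haps):
    """Cross-tabulate haplotypes; rows are true labels, columns reconstructed."""
    labels = sorted(set(true_haps) | set(reconstructed_haps))
    counts = Counter(zip(true_haps, reconstructed_haps))
    matrix = [[counts[(t, r)] for r in labels] for t in labels]
    return labels, matrix


def sensitivity_specificity(true_haps, reconstructed_haps, target):
    """Sensitivity and specificity for one candidate risk haplotype.

    Assumes the target haplotype and at least one other haplotype
    both occur among the true labels (otherwise a division by zero occurs).
    """
    tp = fn = fp = tn = 0
    for t, r in zip(true_haps, reconstructed_haps):
        if t == target:
            if r == target:
                tp += 1
            else:
                fn += 1
        else:
            if r == target:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: ten chromosomes, three haplotypes.
true_haps = ["ACG"] * 4 + ["TCG"] * 2 + ["TCA"] * 4
reconstructed_haps = ["ACG", "ACG", "ACG", "TCG",
                      "TCG", "TCG",
                      "TCA", "TCA", "TCA", "ACG"]

labels, matrix = misclassification_matrix(true_haps, reconstructed_haps)
sens, spec = sensitivity_specificity(true_haps, reconstructed_haps, "ACG")
```

In this toy example one of four true "ACG" haplotypes is reconstructed as "TCG" and one "TCA" haplotype is reconstructed as "ACG", so sensitivity drops while specificity stays comparatively high.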
Results: In a real data scenario based on a gene with high haplotype diversity, a random genotyping error of only 0.5% per SNP reduced the sensitivity of the reconstructed haplotypes substantially, by up to 20%. However, since the specificity was not greatly affected, the overall error rate was moderate. In some cases, the influence of genotyping error, which accumulates across SNPs, on the overall misclassification was larger than that of the pure haplotype reconstruction error. Haplotype reconstruction error alone affected association estimates only slightly. If genotyping error is to be expected, its impact on haplotype association estimates is more pronounced. In simulations with known misclassification, the MC-SIMEX method proved to be a feasible approach to correct haplotype association estimates.
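Why a small per-SNP error can matter at the haplotype level can be seen from a back-of-the-envelope calculation. The sketch below assumes that errors occur independently at each site with the same rate. The function name and the haplotype length of 10 SNPs are hypothetical and not taken from the study.

```python
def prob_haplotype_affected(per_site_error, n_sites):
    """Probability that at least one of n_sites sites carries an error,
    assuming independent errors with a common per-site rate."""
    return 1.0 - (1.0 - per_site_error) ** n_sites


# Hypothetical example: 0.5% error per site, haplotype spanning 10 SNPs.
p_affected = prob_haplotype_affected(0.005, 10)
print(f"{p_affected:.4f}")
```

Under these assumptions roughly 4.9% of 10-SNP haplotypes would contain at least one erroneous site, about ten times the per-site rate.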
Conclusion: In haplotype association analysis, computing the sensitivity, specificity and misclassification matrix of potential risk haplotypes is worthwhile. This information can then be used to correct naïve estimates with existing statistical methods and thus enhance the understanding of haplotype risk estimates.
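The following sketch shows how known sensitivity and specificity could feed into an MC-SIMEX correction. It is not the authors' implementation and simplifies heavily. The assumptions are: a binary carrier indicator for a single risk haplotype, a continuous outcome with a linear model in place of the regression model an association study would typically use, a fixed grid λ = 0, 1, 2 and a quadratic extrapolant. All names and parameter values are hypothetical.

```python
import random


def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def flip_probs(sens, spec, lam):
    """Off-diagonal entries of Pi**lam for a 2x2 misclassification matrix.

    Returns (P(0 -> 1), P(1 -> 0)); requires sens + spec > 1.
    """
    a, b = 1.0 - spec, 1.0 - sens
    d = sens + spec - 1.0  # second eigenvalue of Pi
    scale = (1.0 - d ** lam) / (a + b)
    return a * scale, b * scale


def misclassify(x, p01, p10, rng):
    """Randomly flip binary labels with the given probabilities."""
    out = []
    for xi in x:
        if xi == 0:
            out.append(1 if rng.random() < p01 else 0)
        else:
            out.append(0 if rng.random() < p10 else 1)
    return out


def mc_simex_slope(x_obs, y, sens, spec, n_rep=20, seed=0):
    """Return (naive, corrected) slope estimates."""
    rng = random.Random(seed)
    est = [slope(x_obs, y)]  # lambda = 0: the naive estimate
    for lam in (1.0, 2.0):
        # Simulation step: add extra misclassification Pi**lam.
        p01, p10 = flip_probs(sens, spec, lam)
        reps = [slope(misclassify(x_obs, p01, p10, rng), y)
                for _ in range(n_rep)]
        est.append(sum(reps) / n_rep)
    # Extrapolation step: the quadratic through lambda = 0, 1, 2,
    # evaluated at lambda = -1 (no misclassification).
    corrected = 3 * est[0] - 3 * est[1] + est[2]
    return est[0], corrected


# Hypothetical simulated data with known misclassification.
rng = random.Random(1)
n, beta, sens, spec = 20000, 1.0, 0.90, 0.95
x_true = [1 if rng.random() < 0.3 else 0 for _ in range(n)]
y = [beta * xi + rng.gauss(0.0, 1.0) for xi in x_true]
x_obs = misclassify(x_true, *flip_probs(sens, spec, 1.0), rng)

naive, corrected = mc_simex_slope(x_obs, y, sens, spec)
print(f"true {beta:.2f}, naive {naive:.3f}, MC-SIMEX {corrected:.3f}")
```

In this setting the naive slope is attenuated towards zero, and the extrapolated value is expected to lie closer to the true effect. The quadratic extrapolant is only an approximation, so some residual bias can remain.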