A Robust Statistical Method for Association-Based eQTL Analysis
Abstract
BACKGROUND It is well established that the theoretical kernel of the recently surging genome-wide association study (GWAS) is statistical inference of linkage disequilibrium (LD) between a tested genetic marker and a putative locus affecting a disease trait. However, LD analysis is vulnerable to several confounding factors, of which population stratification is the most prominent. Although many methods have been proposed to correct for its influence, either by predicting the structure parameters or by correcting the stratification-induced inflation of the test statistic, these may not be feasible or may introduce further statistical problems in practical implementation.
METHODOLOGY We propose here a novel statistical method to control spurious LD arising from population structure in GWAS by incorporating a control marker into the test for significance of genetic association between a polymorphic marker and phenotypic variation of a complex trait. The method avoids the need for structure prediction, which may be infeasible or inadequate in practice, and properly accounts for the varying effect of population stratification on different regions of the genome under study. The utility and statistical properties of the new method were tested through an intensive computer simulation study and an association-based genome-wide mapping of expression quantitative trait loci in genetically divergent human populations.
RESULTS/CONCLUSIONS The analyses show that the new method confers improved statistical power for detecting genuine genetic association in subpopulations and effective control of spurious associations stemming from population structure when compared with two other widely implemented methods in the GWAS literature.
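The spurious LD that the abstract attributes to population structure can be illustrated with a small numerical sketch. It is not the authors' method. It only applies the textbook definition of the LD coefficient, D = p_AB - p_A * p_B, to a pooled sample of two subpopulations, under the assumption that each subpopulation is itself in linkage equilibrium. The function names and the example allele frequencies are hypothetical.

```python
def ld_coefficient(p_ab: float, p_a: float, p_b: float) -> float:
    """Standard LD coefficient D = p_AB - p_A * p_B."""
    return p_ab - p_a * p_b


def pooled_ld(m: float, p_a1: float, p_b1: float,
              p_a2: float, p_b2: float) -> float:
    """LD observed when two subpopulations are pooled.

    m          : fraction of the pooled sample drawn from subpopulation 1
    p_a1, p_b1 : frequencies of alleles A and B in subpopulation 1
    p_a2, p_b2 : frequencies of alleles A and B in subpopulation 2

    Assumption: within each subpopulation the two loci are in linkage
    equilibrium, so the AB haplotype frequency is the product p_a * p_b.
    """
    # Pooled haplotype and allele frequencies are weighted averages.
    p_ab = m * p_a1 * p_b1 + (1 - m) * p_a2 * p_b2
    p_a = m * p_a1 + (1 - m) * p_a2
    p_b = m * p_b1 + (1 - m) * p_b2
    return ld_coefficient(p_ab, p_a, p_b)


if __name__ == "__main__":
    # Hypothetical frequencies: both loci differ between subpopulations.
    print(pooled_ld(0.5, 0.8, 0.7, 0.2, 0.3))
    # Same frequencies in both subpopulations: no LD appears on pooling.
    print(pooled_ld(0.5, 0.6, 0.4, 0.6, 0.4))
```

With the first set of hypothetical frequencies the pooled sample shows D of about 0.06 even though neither subpopulation has any LD, which matches the closed form m(1 - m)(p_a1 - p_a2)(p_b1 - p_b2). When allele frequencies do not differ between subpopulations, pooling creates no LD.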
Similar resources
Methods for Population-Based eQTL Analysis in Human Genetics
Gene expression is a critical process in biological systems that is influenced and modulated by many factors including genetic variation. Expression Quantitative Trait Loci (eQTL) analysis provides a powerful way to understand how genetic variants affect gene expression. For genome wide eQTL analysis, the number of genetic variants and that of genes are large and thus the search space is tremend...
Robust Linear Models for Cis-eQTL Analysis
Expression Quantitative Trait Loci (eQTL) analysis enables characterisation of functional genetic variation influencing expression levels of individual genes. In outbred populations, including humans, eQTLs are commonly analysed using the conventional linear model, adjusting for relevant covariates, assuming an allelic dosage model and a Gaussian error term. However, gene expression data gener...
Mapping quantitative trait loci for expression abundance.
Mendelian loci that control the expression levels of transcripts are called expression quantitative trait loci (eQTL). When mapping eQTL, we often deal with thousands of expression traits simultaneously, which complicates the statistical model and data analysis. Two simple approaches may be taken in eQTL analysis: (1) individual transcript analysis in which a single expression trait is mapped a...
Identifying the Genetic Variation of Gene Expression Using Gene Sets: Application of Novel Gene Set eQTL Approach to PharmGKB and KEGG
Genetic variation underlying the regulation of mRNA gene expression in humans may provide key insights into the molecular mechanisms of human traits and complex diseases. Current statistical methods to map genetic variation associated with mRNA gene expression have typically applied standard linkage and/or association methods; however, when genome-wide SNP and mRNA expression data are available...
A statistical framework for eQTL mapping using RNA-seq data.
RNA-seq may replace gene expression microarrays in the near future. Using RNA-seq, the expression of a gene can be estimated using the total number of sequence reads mapped to that gene, known as the total read count (TReC). Traditional expression quantitative trait locus (eQTL) mapping methods, such as linear regression, can be applied to TReC measurements after they are properly normalized. I...