Classification tree for detection of single-nucleotide polymorphism (SNP)-by-SNP interactions related to heart disease: Framingham Heart Study
Abstract
The aim of this study was to detect the effect of interactions between single-nucleotide polymorphisms (SNPs) on the incidence of heart disease. For this purpose, 2912 subjects with 350,160 SNPs from the Framingham Heart Study (FHS) were analyzed. PLINK was used for quality control and to select the 10,000 most significant SNPs. A classification tree algorithm, Generalized, Unbiased, Interaction Detection and Estimation (GUIDE), was then used to build a classification tree and detect SNP-by-SNP interactions among the selected 10,000 SNPs. The classes generated by GUIDE were reexamined with a generalized estimating equations (GEE) model using the empirical variance, to account for potential familial correlation. Overall, 17 classes were generated based on the splitting criteria in GUIDE. The prevalence of coronary heart disease (CHD) was lowest (0.23%) in class 16, which is determined by SNPs rs1894035, rs7955732, rs2212596, and rs1417507. Compared to class 16, all other classes except class 288 (prevalence of 1.2%) had a significantly greater risk under the GEE model. This suggests that the interactions among the SNPs on these node paths are significant.
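The idea of tree-defined classes can be illustrated with a minimal Python sketch. It is not the GUIDE algorithm and it does not fit a GEE model: it only shows how a fixed binary tree of SNP splits assigns each subject to a terminal-node class, and how crude CHD prevalence per class can be compared against the lowest-prevalence class. The SNP names (`snp_a`, `snp_b`), the split thresholds, and the toy subjects are hypothetical, and the node numbering (children of node k are 2k and 2k+1) is an assumption chosen to match class labels such as 16 and 288.

```python
from collections import defaultdict

def assign_class(genotypes, tree):
    """Walk a binary tree of SNP splits and return the terminal node id.

    `tree` maps an internal node id to (snp_name, threshold).
    Assumed numbering: node k has children 2k (left) and 2k + 1 (right).
    """
    node = 1
    while node in tree:
        snp, threshold = tree[node]
        # Go left when the minor-allele count is at or below the threshold.
        node = 2 * node if genotypes[snp] <= threshold else 2 * node + 1
    return node

def class_prevalence(subjects, tree):
    """Return {class id: fraction of subjects in that class with CHD}."""
    counts = defaultdict(lambda: [0, 0])  # class id -> [cases, total]
    for genotypes, has_chd in subjects:
        cls = assign_class(genotypes, tree)
        counts[cls][0] += int(has_chd)
        counts[cls][1] += 1
    return {cls: cases / total for cls, (cases, total) in counts.items()}

# Hypothetical two-split tree; genotypes are coded as minor-allele counts 0/1/2.
tree = {1: ("snp_a", 0), 2: ("snp_b", 1)}

# Toy subjects: (genotypes, has_chd). Purely illustrative, not FHS data.
subjects = [
    ({"snp_a": 0, "snp_b": 0}, False),
    ({"snp_a": 0, "snp_b": 1}, False),
    ({"snp_a": 0, "snp_b": 0}, True),
    ({"snp_a": 0, "snp_b": 0}, False),
    ({"snp_a": 0, "snp_b": 2}, True),
    ({"snp_a": 0, "snp_b": 2}, False),
    ({"snp_a": 1, "snp_b": 0}, True),
    ({"snp_a": 2, "snp_b": 1}, True),
]

prevalence = class_prevalence(subjects, tree)
# Use the lowest-prevalence class as the reference, as the abstract does with class 16.
reference = min(prevalence, key=prevalence.get)
risk_ratio = {cls: p / prevalence[reference] for cls, p in prevalence.items()}

for cls in sorted(prevalence):
    print(cls, prevalence[cls], risk_ratio[cls])
```

The ratios here are crude and ignore family structure. The study instead tests each class against the reference with a GEE model and an empirical variance estimate, which this sketch does not attempt to reproduce.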
Similar resources
Single Nucleotide Polymorphism (SNP) in the Adiponectin Gene and Cardiovascular Disease
Dear Editor, The recent article by Mohammadzadeh et al.[1] in the latest issue of this Journal showed that the T allele of the +276G/T SNP of the ADIPOQ gene is more strongly associated with an increased risk of coronary artery disease (CAD) in subjects with type 2 diabetes. Adipocytes have been described in the myocardial tissue of CAD patients, and their role was recently discussed[2,3]. Susceptibility to CAD by polymorp...
Detecting gene-by-smoking interactions in a genome-wide association study of early-onset coronary heart disease using random forests
BACKGROUND: Genome-wide association studies often fall short of their full potential because of the sheer volume of information they generate. We sought to use the random forest algorithm to identify single-nucleotide polymorphisms (SNPs) that may be involved in gene-by-smoking interactions related to the early onset of coronary heart disease. METHODS: Using data from the Framing...
Decision tree studies in estimating the risk of breast cancer using single-nucleotide polymorphisms
Abstract Introduction: Decision trees are data mining tools for collecting information, making accurate predictions, and sifting information from massive amounts of data, and they are widely used in computational biology and bioinformatics. In bioinformatics, they can be used to predict diseases, including breast cancer. The use of genomic data including single nucleotide polymorphisms is a very important ...
Single Nucleotide Polymorphisms and Association Studies: A Few Critical Points
Uncovering DNA sequence variations that correlate with phenotypic changes, e.g., diseases, is the aim of sequence variation studies. A common type of sequence variation is the single nucleotide polymorphism (SNP, pronounced "snip"). SNPs are third-generation molecular markers. A SNP is a DNA sequence variant of a single base pair with the minor allele occurring in more than 1% of a given popula...
Application of single-nucleotide polymorphism (SNP) as a molecular marker in the study of genetic diversity of aquatic populations
Genetic diversity is one of the important and essential characteristics of any population for its survival. The study of genetic variation in different populations of aquatic organisms is of particular importance in order to protect, stabilize and manage their stocks. Based on studies conducted in recent years, molecular markers have proven that they can be used as indicators of the genetic div...