Semantic prioritization of novel causative genomic variants
Abstract
Discriminating the causative disease variant(s) for individuals with inherited or de novo mutations presents one of the main challenges faced by the clinical genetics community today. Computational approaches for variant prioritization include machine learning methods utilizing a large number of features, including molecular information, interaction networks, or phenotypes. Here, we present the PhenomeNET Variant Predictor (PVP) system, which exploits semantic technologies and automated reasoning over genotype-phenotype relations to filter and prioritize variants in whole exome and whole genome sequencing datasets. We demonstrate the performance of PVP in identifying causative variants on a large number of synthetic whole exome and whole genome sequences, covering a wide range of diseases and syndromes. In a retrospective study, we further illustrate the application of PVP for the interpretation of whole exome sequencing data in patients with congenital hypothyroidism. We find that PVP accurately identifies causative variants in whole exome and whole genome sequencing datasets and provides a powerful resource for the discovery of causal variants.
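To make the idea of phenotype-driven prioritization concrete, the following is a minimal sketch and not the PVP algorithm itself. It assumes a plain Jaccard overlap between a patient's phenotype terms and the phenotype terms annotated to each gene, and ranks candidate variants by that overlap. The function names (jaccard, rank_variants), the gene labels, and the phenotype labels are all hypothetical placeholders. The abstract does not specify how PVP computes or combines its scores.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two sets of phenotype terms."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_variants(
    patient_phenotypes: Set[str],
    gene_phenotypes: Dict[str, Set[str]],
    variants: List[Tuple[str, str]],
) -> List[Tuple[str, str, float]]:
    """Rank (variant_id, gene) pairs by phenotype overlap, highest first.

    This is an illustrative stand-in for a semantic similarity measure,
    not the scoring used by PVP.
    """
    scored = [
        (vid, gene, jaccard(patient_phenotypes, gene_phenotypes.get(gene, set())))
        for vid, gene in variants
    ]
    return sorted(scored, key=lambda item: item[2], reverse=True)


# Hypothetical toy data: labels are placeholders, not real ontology identifiers.
patient = {"goiter", "growth delay", "elevated TSH"}
gene_phenotypes = {
    "GENE_A": {"goiter", "elevated TSH", "hearing loss"},
    "GENE_B": {"retinal degeneration"},
}
variants = [("var1", "GENE_B"), ("var2", "GENE_A")]

ranking = rank_variants(patient, gene_phenotypes, variants)
for variant_id, gene, score in ranking:
    print(variant_id, gene, round(score, 2))
```

In this toy case the variant in GENE_A ranks first because its gene shares two of the patient's three phenotype terms. A real system would typically use ontology-aware similarity and combine it with variant-level pathogenicity evidence, which this sketch leaves out.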
Similar resources
Integrating Multiple Genomic Data to Predict Disease-Causing Nonsynonymous Single Nucleotide Variants in Exome Sequencing Studies
Exome sequencing has been widely used in detecting pathogenic nonsynonymous single nucleotide variants (SNVs) for human inherited diseases. However, traditional statistical genetics methods are ineffective in analyzing exome sequencing data, due to such facts as the large number of sequenced variants, the presence of a non-negligible fraction of pathogenic rare variants or de novo mutations, and ...
CRB1-Related Leber Congenital Amaurosis: Reporting Novel Pathogenic Variants and a Brief Review on Mutations Spectrum
Background: Leber congenital amaurosis (LCA) is a rare inherited retinal disease causing severe visual impairment in infancy. It has been reported that 9-15% of LCA cases have mutations in the CRB1 gene. The complex of CRB1 protein with other associated proteins affects the determination of cell polarity, orientation, and morphogenesis of photoreceptors. Here, we report three novel pathogenic varia...
I-37: Establishing High Resolution Genomic Profiles of Single Cells Using Microarray and Next-Generation Sequencing Technologies
The nature and pace of genome mutation is largely unknown. Standard methods to investigate DNA-mutation rely on arraying or sequencing DNA from a population of cells, hence the genetic composition of individual cells is lost and de novo mutation in cell(s) is concealed within the bulk signal. We developed methods based on (SNP-) arraying and next-generation sequencing of single-cell whole-genom...
Next-Generation Sequencing Reveals One Novel Missense Mutation in COL1A2 Gene in an Iranian Family with Osteogenesis imperfecta
Background: Osteogenesis imperfecta (OI) is a clinically and genetically heterogeneous disorder characterized by bone loss and bone fragility. The aim of this study was to investigate the variants of three genes involved in the pathogenesis of OI. Methods: Molecular genetic analyses were performed for COL1A1, COL1A2, and CRTAP genes in an Iranian family with OI. The DNA samples were analyzed by...
Using Sequence Variants in Linkage Disequilibrium with Causative Mutations to Improve Across-Breed Prediction in Dairy Cattle: A Simulation Study
Sequence data are expected to increase the reliability of genomic prediction by containing causative mutations directly, especially in cases where low linkage disequilibrium between markers and causative mutations limits prediction reliability, such as across-breed prediction in dairy cattle. In practice, the causative mutations are unknown, and prediction with only variants in perfect linkage ...