IMHOTEP—a composite score integrating popular tools for predicting the functional consequences of non-synonymous sequence variants

Authors

  • Carolin Knecht
  • Matthew Mort
  • Olaf Junge
  • David N. Cooper
  • Michael Krawczak
  • Amke Caliebe
Abstract

The in silico prediction of the functional consequences of mutations is an important goal of human pathogenetics. However, bioinformatic tools that classify mutations according to their functionality employ different algorithms, so that predictions may vary markedly between tools. We therefore integrated nine popular prediction tools (PolyPhen-2, SNPs&GO, MutPred, SIFT, MutationTaster2, Mutation Assessor and FATHMM, as well as the conservation-based Grantham Score and PhyloP) into a single predictor. The optimal combination of these tools was selected by means of a wide range of statistical modeling techniques, drawing upon 10 029 disease-causing single nucleotide variants (SNVs) from the Human Gene Mutation Database and 10 002 putatively ‘benign’ non-synonymous SNVs from UCSC. Predictive performance was found to be markedly improved by model-based integration, whilst maximum predictive capability was obtained with either random forest, decision tree or logistic regression analysis. A combination of PolyPhen-2, SNPs&GO, MutPred, MutationTaster2 and FATHMM was found to perform as well as all tools combined. Comparison of our approach with other integrative approaches such as Condel, CoVEC, CAROL, CADD, MetaSVM and MetaLR, using an independent validation dataset, revealed the superiority of our newly proposed integrative approach. An online implementation of this approach, IMHOTEP (‘Integrating Molecular Heuristics and Other Tools for Effect Prediction’), is provided at http://www.uni-kiel.de/medinfo/cgi-bin/predictor/.
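
The model-based integration described in the abstract can be illustrated with a minimal sketch of one of the techniques it names, logistic regression, applied to the five-tool combination. This is not the authors' implementation. The scores are synthetic, every tool score is assumed to be rescaled to [0, 1] with higher values meaning "more likely deleterious", and the function names (`make_synthetic_variants`, `fit_logistic`, `predict`) as well as the training settings are hypothetical choices made for illustration.

```python
import math
import random

# Tool names are taken from the abstract. The scores generated below are
# SYNTHETIC placeholders, not real outputs of these tools.
TOOLS = ["PolyPhen-2", "SNPs&GO", "MutPred", "MutationTaster2", "FATHMM"]


def make_synthetic_variants(n_per_class, rng):
    """Return (rows, labels); label 1 = disease-causing, 0 = benign (made-up data)."""
    rows, labels = [], []
    for label in (1, 0):
        centre = 0.7 if label == 1 else 0.3  # assumed class means on a [0, 1] scale
        for _ in range(n_per_class):
            rows.append([min(1.0, max(0.0, rng.gauss(centre, 0.2))) for _ in TOOLS])
            labels.append(label)
    return rows, labels


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, labels, epochs=200, lr=0.5):
    """Fit logistic regression by batch gradient descent on mean-centred scores."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    weights, bias = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for row, y in zip(rows, labels):
            centred = [x - m for x, m in zip(row, means)]
            err = sigmoid(bias + sum(w * x for w, x in zip(weights, centred))) - y
            grad_b += err
            for j in range(k):
                grad_w[j] += err * centred[j]
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return {"means": means, "weights": weights, "bias": bias}


def predict(model, row):
    """Combined probability that a variant is disease-causing."""
    z = model["bias"] + sum(
        w * (x - m) for w, x, m in zip(model["weights"], row, model["means"])
    )
    return sigmoid(z)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
rows, labels = make_synthetic_variants(200, rng)
model = fit_logistic(rows, labels)
predicted = [1 if predict(model, r) >= 0.5 else 0 for r in rows]
accuracy = sum(p == y for p, y in zip(predicted, labels)) / len(labels)
print(f"training accuracy on synthetic data: {accuracy:.3f}")
for tool, weight in zip(TOOLS, model["weights"]):
    print(f"{tool}: weight {weight:.2f}")
```

The sketch reports accuracy on the data it was trained on, which overstates real performance. The abstract compares approaches on an independent validation dataset, so a realistic evaluation would hold out variants that were not used for fitting.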

Similar resources

A combined functional annotation score for non-synonymous variants.

AIMS: Next-generation sequencing has opened the possibility of large-scale sequence-based disease association studies. A major challenge in interpreting whole-exome data is predicting which of the discovered variants are deleterious or neutral. To address this question in silico, we have developed a score called Combined Annotation scoRing toOL (CAROL), which combines information from 2 bioinfor...

A Bioinformatics Approach to Prioritize Single Nucleotide Polymorphisms in TLRs Signaling Pathway Genes

It has been suggested that single nucleotide polymorphisms (SNPs) in genes involved in Toll-like receptors (TLRs) pathway may exhibit broad effects on function of this network and might contribute to a range of human diseases. However, the extent to which these variations affect TLR signaling is not well understood. In this study, we adopted a bioinformatics approach to predict the consequences...

In-silico study to identify the pathogenic single nucleotide polymorphisms in the coding region of CDKN2A gene

Background: CDKN2A, encoding two important tumor suppressor proteins p16 and p14, is a tumor suppressor gene. Mutations in this gene and subsequently the defect in p16 and p14 proteins lead to the downregulation of RB1/p53 and cancer malignancy. To identify the structural and functional effects of mutations, various powerful bioinformatics tools are available. The aim of this study is the ident...

Comprehensive Analysis of Non-Synonymous Natural Variants of G Protein-Coupled Receptors

G protein-coupled receptors (GPCRs) are the largest superfamily of transmembrane receptors and have vital signaling functions in various organs. Because of their critical roles in physiology and pathology, GPCRs are the most commonly used therapeutic target. It has been suggested that GPCRs undergo massive genetic variations such as genetic polymorphisms and DNA insertions or deletions. Among t...

Comprehensive Computational Analysis of Protein Phenotype Changes Due to Plausible Deleterious Variants of Human SPTLC1 Gene

Genetic variations found in the coding and non-coding regions of a gene are known to influence the structure as well as the function of proteins. Serine palmitoyltransferase long chain subunit 1, a member of the α-oxoamine synthase family, is encoded by the SPTLC1 gene and is a subunit of the enzyme serine palmitoyltransferase (SPT). Mutations in SPTLC1 have been associated with hereditary sensory and auto...

Journal title:

Volume 45  Issue 

Pages  -

Publication date: 2016