Detecting recent positive selection with high accuracy and reliability by conditional coalescent tree.

Authors

  • Minxian Wang
  • Xin Huang
  • Ran Li
  • Hongyang Xu
  • Li Jin
  • Yungang He
Abstract

Studies of natural selection, followed by functional validation, are shedding light on the genetic mechanisms underlying human evolution and adaptation. Classic methods for detecting selection, such as the integrated haplotype score (iHS) and Fay and Wu's H statistic, are useful for finding candidate genes under positive selection. These methods, however, have limited capability to localize causal variants within regions targeted by selection. In this study, we developed a novel method based on a conditional coalescent tree that detects recent positive selection by counting unbalanced mutations on coalescent gene genealogies. Extensive simulation studies revealed that our method is more robust than many other approaches against biases due to various demographic effects, including population bottleneck, expansion, or stratification, without sacrificing power. Furthermore, our method proved superior at localizing causal variants among large numbers of linked genetic variants. The rate of successful localization was about 20-40% higher than that of other state-of-the-art methods on simulated data sets. On empirical data, our method successfully localized the validated functional causal variants of four well-known positively selected genes: ADH1B, MCM6, APOL1, and HBB. Finally, the computational efficiency of this new method was much higher than that of iHS implementations: it was 24-66 times faster than the REHH package and more than 10,000 times faster than the original iHS implementation. This speed makes our method suitable for application to large sequencing data sets. Software can be downloaded from https://github.com/wavefancy/scct.
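
For readers unfamiliar with the classic statistics named in the abstract, the following is a minimal sketch of Fay and Wu's H computed from a derived-allele site frequency spectrum. It is not the authors' conditional coalescent tree method and is not taken from the scct software. It assumes the standard definition H = theta_pi - theta_H, and the function name `fay_wu_h` and its input layout are hypothetical choices made for illustration.

```python
from typing import Sequence


def fay_wu_h(sfs: Sequence[int], n: int) -> float:
    """Fay and Wu's H from an unfolded site frequency spectrum.

    sfs[i - 1] is the number of segregating sites at which the derived
    allele is carried by exactly i of the n sampled chromosomes
    (i = 1 .. n - 1). This input layout is an assumption of the sketch.
    """
    if n < 2:
        raise ValueError("need at least two chromosomes")
    if len(sfs) != n - 1:
        raise ValueError("sfs must have n - 1 entries")

    denom = n * (n - 1)
    # theta_pi: mean number of pairwise differences.
    theta_pi = sum(2 * s * i * (n - i) for i, s in enumerate(sfs, start=1)) / denom
    # theta_H: estimator weighted toward high-frequency derived alleles.
    theta_h = sum(2 * s * i * i for i, s in enumerate(sfs, start=1)) / denom
    # An excess of high-frequency derived alleles drives H negative.
    return theta_pi - theta_h


if __name__ == "__main__":
    # One site where 3 of 4 chromosomes carry the derived allele.
    print(fay_wu_h([0, 0, 1], n=4))
```

A strongly negative H reflects an excess of high-frequency derived alleles, the pattern expected from hitchhiking near a recently selected site. Like iHS, it flags a region rather than pinpointing the causal variant, which is the limitation the abstract describes.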

Journal title:
  • Molecular biology and evolution

Volume 31, Issue 11

Pages: -

Publication date: 2014