Statistical binning enables an accurate coalescent-based estimation of the avian tree.

Authors

  • Siavash Mirarab
  • Md Shamsuzzoha Bayzid
  • Bastien Boussau
  • Tandy Warnow
Abstract

Gene tree incongruence arising from incomplete lineage sorting (ILS) can reduce the accuracy of concatenation-based estimations of species trees. Although coalescent-based species tree estimation methods can have good accuracy in the presence of ILS, they are sensitive to gene tree estimation error. We propose a pipeline that uses bootstrapping to evaluate whether two genes are likely to have the same tree, groups genes into sets using a graph-theoretic optimization, estimates a tree on each set using concatenation, and finally produces an estimated species tree from these trees using the preferred coalescent-based method. Statistical binning improves the accuracy of MP-EST, a popular coalescent-based method, and we use it to produce the first genome-scale coalescent-based avian tree of life.
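
The grouping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each gene has already been reduced to the set of bipartitions (splits) whose bootstrap support exceeds some chosen threshold, that all genes share one taxon set, and that a simple greedy graph coloring is an acceptable stand-in for the graph-theoretic optimization mentioned above. The names `splits_compatible`, `genes_conflict`, and `greedy_bins`, and the toy data, are hypothetical.

```python
from itertools import combinations


def splits_compatible(a, b, taxa):
    """Two splits can sit on one tree iff some pairwise side intersection is empty."""
    a_c, b_c = taxa - a, taxa - b
    return any(not (x & y) for x in (a, a_c) for y in (b, b_c))


def genes_conflict(splits_a, splits_b, taxa):
    """Assumption: inputs hold only splits with bootstrap support above a threshold."""
    return any(
        not splits_compatible(a, b, taxa) for a in splits_a for b in splits_b
    )


def greedy_bins(gene_splits, taxa):
    """Group genes so that no two genes in the same bin conflict.

    Greedy coloring of the incompatibility graph; a stand-in heuristic,
    not the optimization used in the paper.
    """
    graph = {g: set() for g in gene_splits}
    for g, h in combinations(gene_splits, 2):
        if genes_conflict(gene_splits[g], gene_splits[h], taxa):
            graph[g].add(h)
            graph[h].add(g)

    bins = []
    # Place the most-conflicting genes first; break ties by name for determinism.
    for gene in sorted(graph, key=lambda g: (-len(graph[g]), g)):
        candidates = [b for b in bins if not (graph[gene] & b)]
        if candidates:
            # Prefer the smallest compatible bin to keep bin sizes balanced.
            min(candidates, key=len).add(gene)
        else:
            bins.append({gene})
    return bins


# Toy data: five taxa, four genes, each split given as one side of the bipartition.
taxa = frozenset("ABCDE")
gene_splits = {
    "g1": {frozenset("AB")},  # supports AB|CDE
    "g2": {frozenset("AC")},  # supports AC|BDE, which conflicts with g1
    "g3": {frozenset("DE")},  # supports DE|ABC, compatible with both
    "g4": set(),              # no well-supported splits
}
bins = greedy_bins(gene_splits, taxa)
print(sorted(sorted(b) for b in bins))  # [['g1', 'g3'], ['g2', 'g4']]
```

In the pipeline sketched by the abstract, the alignments of the genes in each bin would then be concatenated, a tree estimated per bin, and those trees passed to a coalescent-based method such as MP-EST.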

Similar resources

Response to Comment on "Statistical binning enables an accurate coalescent-based estimation of the avian tree".

Liu and Edwards argue against the use of weighted statistical binning within a species tree estimation pipeline. However, we show that their mathematical argument does not apply to weighted statistical binning. Furthermore, their simulation study does not follow the recommended statistical binning protocol and has data of unknown origin that bias the results against weighted statistical binning.

Comment on "Statistical binning enables an accurate coalescent-based estimation of the avian tree".

Mirarab et al. (Research Article, 12 December 2014, p. 1250463) introduced statistical binning to improve the signal in phylogenetic methods using the multispecies coalescent model. We show that all forms of binning (naïve, statistical, and weighted statistical) display poor performance and are statistically inconsistent in large regions of parameter space, unlike unbinned sequence data used with...

Weighted Statistical Binning: Enabling Statistically Consistent Genome-Scale Phylogenetic Analyses

Because biological processes can result in different loci having different evolutionary histories, species tree estimation requires multiple loci from across multiple genomes. While many processes can result in discord between gene trees and species trees, incomplete lineage sorting (ILS), modeled by the multi-species coalescent, is considered to be a dominant cause for gene tree heterogeneity....

Concatenation Analyses in the Presence of Incomplete Lineage Sorting - PLOS Currents Tree of Life

Incomplete lineage sorting (ILS), modelled by the multi-species coalescent, is a process that results in a gene tree being different from the species tree. Because ILS is expected to occur for at least some loci within genome-scale analyses, the evaluation of species tree estimation methods in the presence of ILS is of great interest. Performance on simulated and biological data have suggested ...

Fast Coalescent-Based Computation of Local Branch Support from Quartet Frequencies

Species tree reconstruction is complicated by effects of incomplete lineage sorting, commonly modeled by the multi-species coalescent model (MSC). While there has been substantial progress in developing methods that estimate a species tree given a collection of gene trees, less attention has been paid to fast and accurate methods of quantifying support. In this article, we propose a fast algori...

Journal:
  • Science

Volume 346, Issue 6215

Pages -

Publication date: 2014