A Linear Time Approximation Scheme for Maximum Quartet Consistency on Sparse Sampled Inputs

Authors

  • Sagi Snir
  • Raphael Yuster
Abstract

Phylogenetic tree reconstruction is a fundamental biological problem. Quartet amalgamation, which combines a set of trees over four taxa into a tree over the full set, stands at the heart of many phylogenetic reconstruction methods. This task has attracted many theoretical as well as practical works. However, even reconstruction from a consistent set of quartet trees, i.e. one in which all quartets agree with some tree, is NP-hard, and the best approximation ratio known is 1/3. For a dense input of Θ(n^4) quartets that are not necessarily consistent, the problem has a polynomial time approximation scheme. When the number of taxa grows, considering such dense inputs is impractical and some sampling approach is imperative. It is known that given a randomly sampled consistent set of quartets from an unknown phylogeny, one can find, in polynomial time and with high probability, a tree satisfying a 0.425 fraction of them, an improvement over the 1/3 ratio. In this paper we further show that given a randomly sampled consistent set of quartets from an unknown phylogeny, where the size of the sample is at least Θ(n^2 log n), there is a randomized approximation scheme that runs in time linear in the number of quartets. The previously known polynomial approximation scheme for that problem required a very dense sample of size Θ(n^4). We note that samples of size Θ(n^2 log n) are sparse in the full quartet set. The result is obtained by a combinatorial technique that may be of independent interest.

Keywords: phylogenetic reconstruction, quartet amalgamation, approximation scheme.

Related articles

A Lookahead Branch-and-Bound Algorithm for the Maximum Quartet Consistency Problem

A lookahead branch-and-bound algorithm is proposed for solving the Maximum Quartet Consistency Problem where the input is a complete set of quartets on the taxa and the goal is to construct a phylogeny which satisfies the maximum number of given quartets. Such a phylogeny constructed from quartets has many advantages over phylogenies constructed through other ways, one of which is that it is ab...

Approximating minimum quartet inconsistency (abstract)

A fundamental problem in computational biology which has been widely studied in the last decades is the reconstruction of evolutionary trees from biological data. Unfortunately, almost all its known formulations are NP-hard. The compelling need for having efficient computational tools to solve this biological problem has brought a lot of attention to the analysis of the quartet paradigm for infe...

A Polynomial Time Approximation Scheme for Inferring Evolutionary Trees from Quartet Topologies and Its Application

Inferring evolutionary trees has long been a challenging problem both for biologists and computer scientists. In recent years research has concentrated on the quartet method paradigm for inferring evolutionary trees. Quartet methods proceed by first inferring the evolutionary history for every set of four species (resulting in a set Q of inferred quartet topologies) and then recombining these inf...

Approximation of stochastic advection diffusion equations with finite difference scheme

In this paper, a high-order and conditionally stable stochastic difference scheme is proposed for the numerical solution of the Itô stochastic advection diffusion equation with a one-dimensional white noise process. We applied a fourth-order finite difference approximation to discretize the spatial derivative of this equation. The main properties of deterministic difference schemes,...

Space-Efficient Approximation Scheme for Maximum Matching in Sparse Graphs

We present a Logspace Approximation Scheme (LSAS), i.e. an approximation algorithm for maximum matching in planar graphs (not necessarily bipartite) that achieves an approximation ratio arbitrarily close to one, using only logarithmic space. This deviates from the well known Baker’s approach for approximation in planar graphs by avoiding the use of distance computation which is not known to be ...

Journal:
  • SIAM J. Discrete Math.

Volume 25

Publication date: 2011