Inference of Large Phylogenies Using Neighbour-Joining
Abstract
The neighbour-joining method is widely used for phylogenetic reconstruction and scales to thousands of taxa. However, advances in sequencing technology have made data sets with more than 10,000 related taxa widely available. Inferring such large phylogenies with the neighbour-joining method takes hours or days on a normal desktop computer because of its O(n^3) running time. RapidNJ is a search heuristic which reduces the running time of the neighbour-joining method significantly, but at the cost of increased memory consumption, which makes inference of large phylogenies infeasible. We present two extensions for RapidNJ which reduce the memory requirements and allow phylogenies with more than 50,000 taxa to be inferred efficiently on a desktop computer. Furthermore, an improved version of the search heuristic is presented which reduces the running time of RapidNJ on many data sets.
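To make the cubic cost concrete, here is a minimal sketch of the canonical neighbour-joining method (not RapidNJ, and not the memory-reducing extensions described above). The function name `neighbour_joining`, the `node0`, `node1`, ... labels and the return format are assumptions chosen for illustration. Each iteration scans all remaining pairs, which costs O(n^2), and there are about n iterations, which gives the O(n^3) total.

```python
from itertools import combinations


def neighbour_joining(names, dist):
    """Canonical neighbour-joining on a symmetric distance matrix.

    names: list of taxon labels; dist: square list of lists (needs >= 2 taxa).
    Returns (joins, last_edge), where each join is
    (child_a, child_b, new_node, branch_len_a, branch_len_b) and
    last_edge is (node_a, node_b, length) for the final two nodes.
    """
    # Dict-of-dicts copy so that nodes can be removed and added by label.
    d = {a: {b: dist[i][j] for j, b in enumerate(names) if i != j}
         for i, a in enumerate(names)}
    joins = []
    next_id = 0
    while len(d) > 2:
        n = len(d)
        r = {a: sum(d[a].values()) for a in d}  # row sums
        # Exhaustive search for the pair minimising
        # Q(a, b) = (n - 2) * d(a, b) - r(a) - r(b).
        # This full scan is the O(n^2) step repeated in every iteration.
        a, b = min(combinations(d, 2),
                   key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]])
        dab = d[a][b]
        len_a = 0.5 * dab + (r[a] - r[b]) / (2 * (n - 2))
        len_b = dab - len_a
        u = f"node{next_id}"  # hypothetical label for the new internal node
        next_id += 1
        joins.append((a, b, u, len_a, len_b))
        # Distances from the new node to every remaining node.
        new_row = {k: 0.5 * (d[a][k] + d[b][k] - dab)
                   for k in d if k not in (a, b)}
        del d[a]
        del d[b]
        for k in d:
            del d[k][a]
            del d[k][b]
            d[k][u] = new_row[k]
        d[u] = new_row
    a, b = d  # exactly two nodes remain; connect them directly
    return joins, (a, b, d[a][b])
```

Called with five taxa `a` to `e` and a 5x5 distance matrix, the function returns three joins plus the final connecting edge. A heuristic such as RapidNJ aims to avoid the exhaustive pair scan marked in the comments; that search strategy is not reproduced here.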
Similar resources
QuickTree: building huge Neighbour-Joining trees of protein sequences
We have written a fast implementation of the popular Neighbor-Joining tree building algorithm. QuickTree allows the reconstruction of phylogenies for very large protein families (including the largest Pfam alignment containing 27000 HIV GP120 glycoprotein sequences) that would be infeasible using other popular methods.
Phylogenetic analysis using complete signature information of whole genomes and clustered Neighbour-Joining method
A new method called Complete Composition Vector (CCV), which is a collection of Composition Vectors (CV), is described to infer evolutionary relationships between species using their complete genomic sequences. Such a method bypasses the complexity of performing multiple sequence alignments and avoids the ambiguity of choosing individual genes for species tree construction. It is expected to ef...
Building Very Large Neighbour-joining Trees
The neighbour-joining method by Saitou and Nei is a widely used method for phylogenetic reconstruction, made popular by a combination of computational efficiency and reasonable accuracy. With its cubic running time by Studier and Kepler, the method scales to hundreds of species, and while it is usually possible to infer phylogenies with thousands of species, tens or hundreds of thousands of spe...
Rapid Neighbour-Joining
The neighbour-joining method reconstructs phylogenies by iteratively joining pairs of nodes until a single node remains. The criterion for which pair of nodes to merge is based on both the distance between the pair and the average distance to the rest of the nodes. In this paper, we present a new search strategy for the optimisation criteria used for selecting the next pair to merge and we show...
Prospects for inferring very large phylogenies by using the neighbor-joining method.
Current efforts to reconstruct the tree of life and histories of multigene families demand the inference of phylogenies consisting of thousands of gene sequences. However, for such large data sets even a moderate exploration of the tree space needed to identify the optimal tree is virtually impossible. For these cases the neighbor-joining (NJ) method is frequently used because of its demonstrat...