The Average Common Substring Approach to Phylogenomic Reconstruction
Similar resources
The Average Common Substring Approach to Phylogenomic Reconstruction
We describe a novel method for efficient reconstruction of phylogenetic trees, based on sequences of whole genomes or proteomes, whose lengths may greatly vary. The core of our method is a new measure of pairwise distances between sequences. This measure is based on computing the average lengths of maximum common substrings, which is intrinsically related to information theoretic tools (Kullbac...
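The quantity named in this abstract, the average length of the longest common substrings, can be illustrated with a minimal sketch. It assumes the usual reading: for each position of one sequence, take the longest substring starting there that also occurs somewhere in the other sequence, then average over positions. The function names are hypothetical, the search is a naive scan rather than the efficient index structure a whole-genome method would need, and the normalisation that turns this average into a distance is not reproduced because the abstract is truncated before describing it.

```python
def longest_match_at(a: str, i: int, b: str) -> int:
    """Length of the longest prefix of a[i:] that occurs anywhere in b."""
    length = 0
    # If a[i:i+length+1] is absent from b, every longer extension is too,
    # so the loop can stop at the first failure.
    while i + length < len(a) and a[i:i + length + 1] in b:
        length += 1
    return length


def average_common_substring(a: str, b: str) -> float:
    """Average over positions i of a of the longest match of a[i:] in b."""
    if not a:
        raise ValueError("sequence a must be non-empty")
    total = sum(longest_match_at(a, i, b) for i in range(len(a)))
    return total / len(a)


if __name__ == "__main__":
    # Toy sequences chosen for illustration only.
    print(average_common_substring("ACGT", "CGTA"))
    print(average_common_substring("ACGT", "ACGT"))
```

Note that the measure is not symmetric in its two arguments, so a practical distance would have to combine both directions in some way.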
kmacs: the k-mismatch average common substring approach to alignment-free sequence comparison
MOTIVATION: Alignment-based methods for sequence analysis have various limitations if large datasets are to be analysed. Therefore, alignment-free approaches have become popular in recent years. One of the best known alignment-free methods is the average common substring approach that defines a distance measure on sequences based on the average length of longest common words between them. Herein...
kmacs: the k-Mismatch Average Common Substring Approach for Phylogeny Reconstruction
The vast majority of sequence comparison methods for phylogeny reconstruction rely on pairwise or multiple sequence alignments. These approaches are in practice not usable for longer sequences such as complete genomes. For this reason alignment-free methods have recently become more popular because they are much faster and usually computable in linear time. Some of these methods are based on re...
String Reconstruction from Substring Compositions
Motivated by mass-spectrometry protein sequencing, we consider the problem of reconstructing a string from the multisets of its substring composition. We show that all strings of length 7, one less than a prime and one less than twice a prime, can be reconstructed uniquely up to reversal. For all other lengths, we show that unique reconstruction is not always possible and provide sometimes-tigh...
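To make the "up to reversal" qualifier concrete, here is a small sketch. It assumes that the composition of a substring means the multiset of its characters with order discarded, and that the input to the reconstruction problem is the multiset of these compositions over all contiguous substrings. The function name is hypothetical and the code only computes the multiset; it does not attempt reconstruction.

```python
from collections import Counter


def substring_compositions(s: str) -> Counter:
    """Multiset of character compositions of all contiguous substrings of s."""
    compositions = Counter()
    for start in range(len(s)):
        counts = Counter()
        for end in range(start, len(s)):
            counts[s[end]] += 1
            # A sorted tuple of (character, count) pairs is a hashable
            # stand-in for the order-free composition of s[start:end + 1].
            compositions[tuple(sorted(counts.items()))] += 1
    return compositions


if __name__ == "__main__":
    # A string and its reversal always yield the same multiset.
    print(substring_compositions("abc") == substring_compositions("cba"))
    # Two different arrangements of the same letters can be told apart.
    print(substring_compositions("aab") == substring_compositions("aba"))
```

Because reversing a string maps each substring to a substring with the same characters, the two are indistinguishable from this data, which is why uniqueness can only hold up to reversal.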
Sparse LCS Common Substring Alignment
The “Common Substring Alignment” problem is defined as follows. The input consists of a set of strings S1, S2, ..., Sc, with a common substring appearing at least once in each of them, and a target string T. The goal is to compute the similarity of all strings Si with T, without computing the part of the common substring over and over again. In this paper we consider the Common Substring Align...
Journal
Journal title: Journal of Computational Biology
Year: 2006
ISSN: 1066-5277,1557-8666
DOI: 10.1089/cmb.2006.13.336