Likelihood vs. Information in Aligning Biopolymer Sequences

Author

  • Timothy L. Bailey
Abstract

Biopolymer sequences often contain regions of similarity with other sequences due to homology or common function. A common method of discovering patterns in biopolymer sequences is to align a set of sequences so that certain columns of the alignment have highly non-random residue frequency distributions. The pattern can then be described in terms of a consensus pattern, motif, profile, specificity matrix or regular expression. This research note shows that a commonly used method of measuring the "goodness" of an alignment based on information theory is actually equivalent to maximizing the likelihood ratio of two hypotheses when the assumed probability distribution is multinomial. In addition, a method which has been used by other workers for determining whether a new sequence contains the pattern is shown to be essentially equivalent to a likelihood ratio. This offers a new, uniform way of thinking about the information contained in a set of aligned sequences which is more intuitive, and may aid the development of improved algorithms.
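The equivalence claimed in the abstract can be checked numerically for a single alignment column. The sketch below is an illustration only, not the paper's own derivation: it assumes a uniform DNA background, a made-up eight-residue column, and hypothetical function names. It compares the information-theoretic score (relative entropy of the observed residue frequencies against the background) with the log likelihood ratio of a multinomial "pattern" model fitted to the column versus the background model. The multinomial coefficient is identical under both hypotheses, so it cancels in the ratio and is omitted.

```python
import math
from collections import Counter


def information_content(column, background):
    """Relative entropy (in nats) of observed column frequencies vs. background."""
    counts = Counter(column)
    n = len(column)
    return sum(
        (c / n) * math.log((c / n) / background[residue])
        for residue, c in counts.items()
    )


def log_likelihood_ratio(column, background):
    """Log of P(column | fitted multinomial) / P(column | background).

    The fitted model uses the maximum-likelihood estimate f_a = n_a / N.
    The multinomial coefficient cancels in the ratio, so it is left out.
    """
    counts = Counter(column)
    n = len(column)
    ll_pattern = sum(c * math.log(c / n) for c in counts.values())
    ll_background = sum(
        c * math.log(background[residue]) for residue, c in counts.items()
    )
    return ll_pattern - ll_background


# Assumed inputs for illustration: uniform background, one hypothetical column.
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
column = "AAAAAAGA"

n = len(column)
ic = information_content(column, background)
llr = log_likelihood_ratio(column, background)

# Under these assumptions the two scores differ only by the factor N.
print(f"N * information content = {n * ic:.6f}")
print(f"log likelihood ratio    = {llr:.6f}")
```

Under these assumptions the log likelihood ratio equals the number of sequences times the per-column information content, so ranking alignments by one score ranks them identically by the other when the number of sequences is fixed.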

Related resources

Aligning Alignments

While the area of sequence comparison has a rich collection of results on the alignment of two sequences, and even the alignment of multiple sequences, there is little known about the alignment of two alignments. The problem becomes interesting when the alignment objective function counts gaps, as is common when aligning biological sequences, and has the form of the sum-of-pairs objective. We b...

An Evolutionary and Phylogenetic Study of the BMP15 Gene

DNA sequence data contains a wealth of biologically useful information. Recent innovations in DNA sequencing technology have greatly increased our capacity to determine massive amounts of nucleotide sequences. These sequences can be used to specify the characteristics of different regions, interpret the evolutionary relationships between categorized groups, likelihood of performing multiple com...

Aligning short sequencing reads with Bowtie.

This unit shows how to use the Bowtie package to align short sequencing reads, such as those output by second-generation sequencing instruments. It also includes protocols for building a genome index and calling consensus sequences from Bowtie alignments using SAMtools.

Automated Reconstruction of Whole-Genome Phylogenies from Short-Sequence Reads

Studies of microbial evolutionary dynamics are being transformed by the availability of affordable high-throughput sequencing technologies, which allow whole-genome sequencing of hundreds of related taxa in a single study. Reconstructing a phylogenetic tree of these taxa is generally a crucial step in any evolutionary analysis. Instead of constructing genome assemblies for all taxa, annotating ...

Publication date: 1993