Computational Biology Lecture 11: Pairwise alignment using HMMs

Author

  • Saad Mneimneh
Abstract

We have looked at various alignment algorithms with different scoring schemes. We argued that the score of an alignment reflects the relative likelihood that the two sequences are related rather than unrelated, and we used the log-odds ratio to express this relative likelihood while keeping the scoring scheme additive. Maximizing the score of an alignment was therefore, in some sense, equivalent to maximizing the log-odds ratio, except that gaps are scored separately and are not tied to the log-odds ratio. Recall that we derived the log-odds scores by considering only ungapped alignments and assumed a separate model for scoring gaps, for instance an affine gap penalty function. We now unify both models into a single probabilistic model and show how the score of a gapped alignment produced by the Needleman-Wunsch algorithm can be viewed as a maximum log-odds ratio obtained by the Viterbi algorithm for an HMM that generates two sequences simultaneously.
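The idea in the abstract can be illustrated with a minimal sketch of Viterbi decoding for a three-state pair HMM, written directly in log-odds form. This is not necessarily the lecture's exact formulation. The state names (M for a match column, X and Y for a gap in either sequence), the parameters `delta` (gap open probability) and `epsilon` (gap extend probability), and the helper names `toy_dna_model` and `pair_hmm_viterbi_log_odds` are all assumptions made for illustration. The sketch also omits an explicit end state and the length terms of the random model, so that the gap-state emissions cancel against the background model and only transition terms remain as gap costs.

```python
import math

NEG_INF = float("-inf")


def toy_dna_model(p_same=0.2):
    """Hypothetical toy model: uniform background q and a symmetric joint p."""
    alphabet = "ACGT"
    q = {a: 0.25 for a in alphabet}
    p_diff = (1.0 - 4 * p_same) / 12  # the 16 joint probabilities sum to 1
    p_match = {(a, b): (p_same if a == b else p_diff)
               for a in alphabet for b in alphabet}
    return p_match, q


def pair_hmm_viterbi_log_odds(x, y, p_match, q, delta, epsilon):
    """Best log-odds score over all alignments of x and y.

    Assumes 0 < delta < 0.5 and 0 < epsilon < 1, no end state, and no
    direct X <-> Y transitions (simplifications made for this sketch).
    """
    n, m = len(x), len(y)
    log_stay_match = math.log(1 - 2 * delta)  # M -> M
    log_close_gap = math.log(1 - epsilon)     # X -> M or Y -> M
    log_open_gap = math.log(delta)            # M -> X or M -> Y
    log_extend_gap = math.log(epsilon)        # X -> X or Y -> Y

    # v_s[i][j]: best log-odds of a path that has emitted x[:i] and y[:j]
    # and is currently in state s.
    v_m = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    v_x = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    v_y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    v_m[0][0] = 0.0  # the begin state is treated like M

    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                a, b = x[i - 1], y[j - 1]
                # Substitution score: log-odds of the aligned pair.
                s = math.log(p_match[(a, b)] / (q[a] * q[b]))
                v_m[i][j] = s + max(v_m[i - 1][j - 1] + log_stay_match,
                                    v_x[i - 1][j - 1] + log_close_gap,
                                    v_y[i - 1][j - 1] + log_close_gap)
            if i > 0:  # x[i-1] aligned to a gap
                v_x[i][j] = max(v_m[i - 1][j] + log_open_gap,
                                v_x[i - 1][j] + log_extend_gap)
            if j > 0:  # y[j-1] aligned to a gap
                v_y[i][j] = max(v_m[i][j - 1] + log_open_gap,
                                v_y[i][j - 1] + log_extend_gap)

    return max(v_m[n][m], v_x[n][m], v_y[n][m])


if __name__ == "__main__":
    p_match, q = toy_dna_model()
    print(pair_hmm_viterbi_log_odds("ACGT", "AGT", p_match, q, 0.1, 0.3))
```

The recurrences have the same shape as a Needleman-Wunsch dynamic program with affine gaps: the log-odds term plays the role of the substitution score, while the logarithms of the transition probabilities act as gap open and gap extend penalties.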

Similar resources

gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences

Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...

Aligning sequences with repetitive motifs

Pairwise sequence alignment is among the most intensively studied problems in computational biology. We present a method for alignment of two sequences containing repetitive motifs. This is motivated by biological studies of proteins with zinc finger domain, an important group of regulatory proteins. Due to their evolutionary history, sequences of these proteins contain a variable number of dif...

Homology Detection via Family Pairwise Search

The function of an unknown biological sequence can often be accurately inferred by identifying sequences homologous to the original sequence. Given a query set of known homologs, there exist at least three general classes of techniques for finding additional homologs: pairwise sequence comparisons, motif analysis, and hidden Markov modeling. Pairwise sequence comparisons are typically employed ...

Combining Pairwise Sequence Similarity and Support Vector Machines for Detecting Remote Protein Evolutionary and Structural Relationships

One key element in understanding the molecular machinery of the cell is to understand the structure and function of each protein encoded in the genome. A very successful means of inferring the structure or function of a previously unannotated protein is via sequence similarity with one or more proteins whose structure or function is already known. Toward this end, we propose a means of represen...

Pattern Matching Techniques and Their Applications to Computational Molecular Biology - A Review

Pattern matching techniques have been useful in solving many problems associated with computer science, including data compression (Crochemore and Lecroq, 1996), data encryption (RSA Laboratories, 1993), and computer vision (Grimson and Huttenlocher, 1990). In recent years, developments in molecular biology have led to large scale sequencing of genomic DNA. Since this data is being produced in...

Publication date: 2004