Multiple Partial Order Alignment as a Graph Problem
Authors
Abstract
Multiple Sequence Alignment (MSA) is a fundamental tool of bioinformatics. Row-Column MSA (RC-MSA) methods such as CLUSTALW [12] produce the now-familiar tabular alignments. However, these methods have a number of shortcomings: their results can be hard to interpret, their computational complexity is high, they rest on questionable assumptions, and they introduce artifacts such as poor handling of prefix and suffix alignment. Partial Order Alignment was proposed recently as an alternative approach to MSA. Partial Order MSA (PO-MSA) methods produce a partial order (a labeled directed acyclic graph) that includes the input sequences as subgraphs. The two approaches differ in their strengths and weaknesses as well as in their assumptions. In this paper, we formalize PO-MSA as a graph problem, show that it corresponds to finding a Minimal Common Supergraph for a set of partial order graphs, and characterize how such a supergraph can be derived. This formalization offers some perspective on MSA generally, and on particular tradeoffs between RC-MSA and PO-MSA.
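To make the idea of a labeled directed acyclic graph that contains the input sequences concrete, the following is a minimal Python sketch. It is an illustration only, not the paper's formalization: the class `LabeledDAG`, its methods, and the example sequences are hypothetical names chosen here, and "containing a sequence" is approximated by the weaker check that some directed path spells the sequence.

```python
class LabeledDAG:
    """A small labeled directed graph; each node carries one residue label.

    Hypothetical helper for illustration, not taken from the paper.
    """

    def __init__(self):
        self.labels = {}  # node id -> label (e.g. a nucleotide)
        self.succ = {}    # node id -> set of successor node ids

    def add_node(self, node, label):
        self.labels[node] = label
        self.succ.setdefault(node, set())

    def add_path(self, nodes):
        """Add an edge between each pair of consecutive nodes."""
        for u, v in zip(nodes, nodes[1:]):
            self.succ[u].add(v)

    def is_acyclic(self):
        """Kahn's algorithm: acyclic iff every node gets removed."""
        indegree = {n: 0 for n in self.labels}
        for u in self.labels:
            for v in self.succ[u]:
                indegree[v] += 1
        ready = [n for n, d in indegree.items() if d == 0]
        removed = 0
        while ready:
            u = ready.pop()
            removed += 1
            for v in self.succ[u]:
                indegree[v] -= 1
                if indegree[v] == 0:
                    ready.append(v)
        return removed == len(self.labels)

    def spells(self, seq):
        """True if some directed path has labels that spell seq."""
        if not seq:
            return True
        frontier = {n for n, lab in self.labels.items() if lab == seq[0]}
        for ch in seq[1:]:
            frontier = {v for u in frontier for v in self.succ[u]
                        if self.labels[v] == ch}
            if not frontier:
                return False
        return bool(frontier)


def build_example():
    """Merge the toy sequences ACGT and ATGT into one graph.

    Shared residues become shared nodes; the mismatch (C vs. T)
    becomes a branch.
    """
    g = LabeledDAG()
    for node, label in enumerate("ACTGT"):
        g.add_node(node, label)   # 0:A 1:C 2:T 3:G 4:T
    g.add_path([0, 1, 3, 4])      # A -> C -> G -> T
    g.add_path([0, 2, 3, 4])      # A -> T -> G -> T
    return g


if __name__ == "__main__":
    g = build_example()
    print(g.is_acyclic(), g.spells("ACGT"), g.spells("ATGT"), g.spells("ACTG"))
```

In this toy graph both input sequences appear as paths, while a string such as ACTG that is neither input does not. A row-column alignment would store the same information as a two-row table with one mismatched column.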
Similar resources
Pairwise Partial Order Alignment as a Supergraph Problem — Aligning Alignments Revisited
Partial Order Alignment (POA) has been proposed recently as an alternative to conventional sequence alignment. Instead of the familiar tabular alignments, POA methods produce a partial order — a labeled directed acyclic graph — that includes the input sequences. In this paper, we formalize POA in terms of graphs, and show it corresponds to finding a Minimal Common Supergraph for a set of partia...
Multiple sequence alignment using partial order graphs
MOTIVATION: Progressive Multiple Sequence Alignment (MSA) methods depend on reducing an MSA to a linear profile for each alignment step. However, this leads to loss of information needed for accurate alignment, and gap scoring artifacts. RESULTS: We present a graph representation of an MSA that can itself be aligned directly by pairwise dynamic programming, eliminating the need to reduce the MS...
Multiple flexible structure alignment using partial order graphs
MOTIVATION: Existing comparisons of protein structures are not able to describe structural divergence and flexibility in the structures being compared because they focus on identifying a common invariant core and ignore parts of the structures outside this core. Understanding the structural divergence and flexibility is critical for studying the evolution of functions and specificities of protei...
The Footprint Sorting Problem
Phylogenetic footprints are short pieces of noncoding DNA sequence in the vicinity of a gene that are conserved between evolutionarily distant species. A seemingly simple problem is to sort footprints in their order along the genomes. It is complicated by the fact that not all footprints are collinear: they may cross each other. The problem thus becomes the identification of the crossing footprin...
A novel method for multiple alignment of sequences with repeated and shuffled elements.
We describe ABA (A-Bruijn alignment), a new method for multiple alignment of biological sequences. The major difference between ABA and existing multiple alignment methods is that ABA represents an alignment as a directed graph, possibly containing cycles. This representation provides more flexibility than does a traditional alignment matrix or the recently introduced partial order alignment (P...