Parametric k-best alignment
Abstract
Optimal sequence alignments depend heavily on the alignment scoring parameters. Given input sequences, parametric alignment is the well-studied problem of finding all possible optimal alignment summaries as the parameters vary, together with the optimality region of scoring parameters that yields each optimal alignment. But biologically correct alignments might be suboptimal for every parameter choice. We therefore extend parametric alignment to parametric k-best alignment, which asks for all possible k-tuples of k-best alignment summaries (s1, s2, ..., sk), together with the k-best optimality region of scoring parameters that make s1, s2, ..., sk the top k summaries. By exploiting the integer structure of alignment summaries, we show that, astonishingly, the complexity of parametric k-best alignment is only polynomial in k. Parametric k-best alignment is thus tractable and, like parametric alignment, can be applied at the whole-genome scale.
Corresponding author: Peter Huggins
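To make the notions of alignment summary and k-best summaries concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm. It assumes a summary is the triple (matches, mismatches, gaps) and a two-parameter score, matches - mismatch_penalty * mismatches - gap_penalty * gaps. The function names `alignment_summaries` and `k_best_summaries` are hypothetical, and the enumeration is only practical for very short sequences.

```python
from functools import lru_cache


def alignment_summaries(a, b):
    """Return every (matches, mismatches, gaps) triple realised by some
    global alignment of a and b. Assumed summary definition, brute force."""

    @lru_cache(maxsize=None)
    def rec(i, j):
        # Summaries of all alignments of the prefixes a[:i] and b[:j].
        if i == 0 and j == 0:
            return frozenset({(0, 0, 0)})
        out = set()
        if i > 0 and j > 0:
            hit = int(a[i - 1] == b[j - 1])
            for m, x, g in rec(i - 1, j - 1):
                out.add((m + hit, x + (1 - hit), g))
        if i > 0:  # a[i-1] aligned against a gap
            for m, x, g in rec(i - 1, j):
                out.add((m, x, g + 1))
        if j > 0:  # b[j-1] aligned against a gap
            for m, x, g in rec(i, j - 1):
                out.add((m, x, g + 1))
        return frozenset(out)

    return rec(len(a), len(b))


def k_best_summaries(a, b, mismatch_penalty, gap_penalty, k):
    """Rank summaries by the assumed linear score and return the top k."""

    def score(s):
        m, x, g = s
        return m - mismatch_penalty * x - gap_penalty * g

    # Ties are broken by the summary tuple itself so the order is deterministic.
    ranked = sorted(alignment_summaries(a, b), key=lambda s: (-score(s), s))
    return ranked[:k]
```

Because the score is linear in the summary, each k-tuple of top summaries is optimal on a region of the (mismatch_penalty, gap_penalty) plane cut out by linear inequalities. Calling `k_best_summaries` at different parameter values samples those regions one point at a time, whereas the parametric problem asks for the regions themselves.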
Similar resources
Parametric Alignment of Multiple Biological Sequences
The alignment problem for DNA or protein sequences is important and widely applicable in various fields of molecular biology. In this problem, the optimal solution obtained with fixed parameters (gap penalties, weights for weighted multiple alignment problems, and so on) is not always the biologically best alignment. It is therefore necessary to vary the parameters and examine how the optimal alignments change. The...
Bounds for Parametric Sequence Comparison
We consider the problem of computing a global alignment between two or more sequences subject to varying mismatch and indel penalties. We prove a tight 3(n/2π)^(2/3) + O(n^(1/3) log n) bound on the worst-case number of distinct optimum alignments for two sequences of length n as the parameters are varied. This refines an O(n^(2/3)) upper bound by Gusfield et al., answering a question posed by Pevzner and Waterman. Ou...
Inverse Parametric Alignment for Accurate Biological Sequence Comparison
For as long as biologists have been computing alignments of sequences, the question of what values to use for scoring substitutions and gaps has persisted. In practice, substitution scores are usually chosen by convention, and gap penalties are often found by trial and error. In contrast, a rigorous way to determine parameter values that are appropriate for aligning biological sequences is by s...
Homology modeling using parametric alignment ensemble generation with consensus and energy-based model selection
The accuracy of a homology model based on the structure of a distant relative or other topologically equivalent protein is primarily limited by the quality of the alignment. Here we describe a systematic approach for sequence-to-structure alignment, called 'K*Sync', in which alignments are generated by dynamic programming using a scoring function that combines information on many protein featur...
Dependency Forest based Word Alignment
A hierarchical word alignment model that searches for k-best partial alignments on target constituent 1-best parse trees has been shown to outperform previous models. However, relying solely on 1-best parse trees might hinder the search for good alignments, because 1-best trees are not necessarily the best for word alignment tasks in practice. This paper introduces a dependency forest based wor...