One-Annotated Constrained Sequence Alignment

Authors

  • YUN-SHENG CHUNG
  • CHUAN YI TANG
Abstract

The constrained multiple sequence alignment (CMSA) problem is to align a set of strings such that the given patterns (the constraint) appear in the same positions, in a specified order, in each of the strings in the resulting alignment. The best previous result for the pair-wise version takes O(mn) time and space [2, 10], where m is the number of patterns (defined later) and n is the maximum string length. In this paper, we deal with the pair-wise case in which the positions of the occurrences of the patterns in one of the strings are given. This version arises naturally in applications but has not been discussed previously [8, 2, 10]. We present an algorithm taking O(n) time and O(n + r) space for this version, where r is the number of occurrences of all the patterns. This result in turn improves the 2-approximation algorithm proposed in [2] for the original CMSA problem from O(Ckmn) time and O(kmn) space to O(Ckn) time and O(kn) space, where k is the number of sequences and C is the maximum number of valid “constrained lists” (defined later).

Key-Words: biological sequence comparison, constrained sequence alignment, computational biology
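
To make the constraint in the abstract concrete, the following is a minimal sketch of a checker for the pair-wise case. It is not the authors' algorithm: it only tests whether a given alignment satisfies the constraint. The function name `satisfies_constraint`, the use of `-` as the gap symbol, and the assumption that each pattern must occur contiguously (without gaps) at the same columns of both aligned rows are illustrative choices, since the abstract defers the formal definition of patterns to the body of the paper.

```python
def satisfies_constraint(row_a, row_b, patterns):
    """Check whether a pairwise alignment satisfies a pattern constraint.

    row_a, row_b: the two aligned rows, equal length, '-' marks a gap.
    patterns:     the constraint, a list of strings in the required order.

    Assumption (not taken from the paper): a pattern is satisfied when it
    occurs gap-free at the same starting column in both rows, and the
    occurrences of successive patterns do not overlap.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")

    col = 0  # first column still available for the next pattern
    for p in patterns:
        # Greedily take the leftmost column where both rows show the pattern;
        # choosing the leftmost match never rules out a later valid choice.
        while col + len(p) <= len(row_a):
            if row_a.startswith(p, col) and row_b.startswith(p, col):
                break
            col += 1
        else:
            return False  # ran out of columns without a common occurrence
        col += len(p)
    return True
```

For example, the rows `AC-GT` and `ACTGT` satisfy the constraint `["AC", "GT"]`, whereas `ACGT-` and `A-CGT` do not satisfy `["CG"]`, because `CG` starts at different columns in the two rows.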

Similar resources

Constrained Sequence Alignment: A Dedicated Version and Its Applications

In this paper, we study a problem that arises naturally in biological applications. Given two sequences, along with a sequence of patterns, we want to align the two sequences such that the specified patterns are aligned together. This is the constrained sequence alignment problem and is defined in [14]. The multiple sequence version is called CMSA. In this paper, we focus on the pairwise versio...

Identifying a High Fraction of the Human Genome to be under Selective Constraint Using GERP++

Computational efforts to identify functional elements within genomes leverage comparative sequence information by looking for regions that exhibit evidence of selective constraint. One way of detecting constrained elements is to follow a bottom-up approach by computing constraint scores for individual positions of a multiple alignment and then defining constrained elements as segments of contig...

An Algorithm and Applications to Sequence Alignment with Weighted Constraints

Given two sequences S1, S2, and a constrained sequence C, a longest common subsequence of S1, S2 with restriction to C is called a constrained longest common subsequence of S1 and S2 with C. At the same time, an optimal alignment of S1, S2 with restriction to C is called a constrained pairwise sequence alignment of S1 and S2 with C. Previous algorithms have shown that the constrained longest co...

A Parallel GPU-Designed Algorithm for the Constrained Multiple Sequence Alignment Problem

Modern graphical processing units (GPUs) offer much more computational power than modern CPUs, so it is natural that GPUs are often used for solving many computationally-intensive problems. One of the tasks of huge importance in bioinformatics is sequence alignment. We investigate its variant introduced a few years ago in which some additional requirement on the alignment is given. As a result ...

Ore extraction and blending optimization model in poly-metallic open pit mines by chance constrained one-sided goal programming

Determining the sequence of ore extraction is one of the most important problems in annual mine production scheduling. Production scheduling affects mining performance, especially in a poly-metallic open pit mine, when considering the imposed operational and physical constraints mandated by high levels of reliability in relation to the obtained actual results. One of the important operational con...

Publication date: 2004