Searching for repeats, as an example of using the generalised Ruzzo-Tompa algorithm to find optimal subsequences with gaps

Authors

  • John L. Spouge
  • Leonardo Mariño-Ramírez
  • Sergey Sheetlin
Abstract

Some biological sequences contain subsequences of unusual composition; e.g. some proteins contain DNA binding domains, transmembrane regions and charged regions, and some DNA sequences contain repeats. The linear-time Ruzzo-Tompa (RT) algorithm finds subsequences of unusual composition, using a sequence of scores as input and the corresponding 'maximal segments' as output. In principle, permitting gaps in the output subsequences could improve sensitivity. Here, the input of the RT algorithm is generalised to a finite, totally ordered, weighted graph, so the algorithm locates paths of maximal weight through increasing but not necessarily adjacent vertices. By permitting the penalised deletion of unfavourable letters, the generalisation therefore includes gaps. The program RepWords, which finds inexact simple repeats in DNA, exemplifies the general concepts by out-performing a similar extant, ad hoc tool. With minimal programming effort, the generalised Ruzzo-Tompa algorithm could improve the performance of many programs for finding biological subsequences of unusual composition.
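As a rough illustration of the classic, ungapped RT algorithm that the abstract takes as its starting point, the sketch below turns a list of scores into its maximal segments. It is not the generalised, gapped algorithm of the paper and not the RepWords implementation. The function name `maximal_segments` and the `(start, end, score)` output format are assumptions made for this example. For brevity, the candidate list is scanned directly from right to left, so this version does not preserve the linear-time bound of the original algorithm.

```python
def maximal_segments(scores):
    """Return (start, end, score) for each maximal scoring segment.

    `start` is inclusive and `end` is exclusive. Illustrative sketch of the
    classic Ruzzo-Tompa procedure; names and output format are hypothetical.
    """
    # Each candidate is [start, end, L, R], where L is the cumulative score
    # just before the segment and R is the cumulative score at its end.
    candidates = []
    total = 0
    for i, s in enumerate(scores):
        left = total
        total += s
        if s <= 0:
            continue  # non-positive scores never start a segment
        cand = [i, i + 1, left, total]
        while True:
            # Search right to left for the nearest candidate with smaller L.
            j = len(candidates) - 1
            while j >= 0 and candidates[j][2] >= cand[2]:
                j -= 1
            if j < 0 or candidates[j][3] >= cand[3]:
                # No such candidate, or it already ends at least as high:
                # the new segment stands on its own.
                candidates.append(cand)
                break
            # Otherwise extend the new segment leftwards to absorb
            # candidate j and everything after it, then reconsider.
            cand = [candidates[j][0], cand[1], candidates[j][2], cand[3]]
            del candidates[j:]
    return [(a, b, r - l) for a, b, l, r in candidates]


if __name__ == "__main__":
    print(maximal_segments([4, -5, 3, -3, 1, 2, -2, 2, -2, 1, 5]))
```

Each returned triple is one maximal segment with its total score. A gapped variant in the spirit of the abstract would additionally allow a segment to skip unfavourable letters at a penalty, which this sketch does not attempt.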

Similar resources

A Linear Time Algorithm for Finding All Maximal Scoring Subsequences

Given a sequence of real numbers ("scores"), we present a practical linear time algorithm to find those nonoverlapping, contiguous subsequences having greatest total scores. This improves on the best previously known algorithm, which requires quadratic time in the worst case. The problem arises in biological sequence analysis, where the high-scoring subsequences correspond to regions of unusual...

A full ranking method using integrated DEA models and its application to modify GA for finding Pareto optimal solution of MOP problem

This paper uses integrated Data Envelopment Analysis (DEA) models to rank all extreme and non-extreme efficient Decision Making Units (DMUs) and then applies integrated DEA ranking method as a criterion to modify Genetic Algorithm (GA) for finding Pareto optimal solutions of a Multi Objective Programming (MOP) problem. The researchers have used ranking method as a shortcut way to modify GA to d...

Control of nonlinear systems using a hybrid APSO-BFO algorithm: An optimum design of PID controller

This paper proposes a novel hybrid algorithm namely APSO-BFO which combines merits of Bacterial Foraging Optimization (BFO) algorithm and Adaptive Particle Swarm Optimization (APSO) algorithm to determine the optimal PID parameters for control of nonlinear systems. To balance between exploration and exploitation, the proposed hybrid algorithm accomplishes global search over the whole search spa...

Constrained Nonlinear Optimal Control via a Hybrid BA-SD

The non-convex behavior presented by nonlinear systems limits the application of classical optimization techniques to solve optimal control problems for these kinds of systems. This paper proposes a hybrid algorithm, namely BA-SD, by combining Bee algorithm (BA) with steepest descent (SD) method for numerically solving nonlinear optimal control (NOC) problems. The proposed algorithm includes th...

Journal title:
  • International journal of bioinformatics research and applications

Volume 10, Issue 4-5

Pages: -

Publication date: 2014