Rapid Homology Search with Two-Stage Extension and Daughter Seeds

Authors

  • Miklós Csürös
  • Bin Ma
Abstract

Using a seed to rapidly “hit” possible homologies for further examination is a common practice to speed up homology search in molecular sequences. It has been shown that a collection of higher-weight seeds has better sensitivity than a single lower-weight seed at the same speed. However, huge memory requirements diminish the advantages of high-weight seeds. This paper describes a two-stage extension method, which simulates high-weight seeds with modest memory requirements. The paper also proposes the use of so-called daughter seeds, which are an extension of the previously studied vector seed idea. Daughter seeds, especially when combined with the two-stage extension, provide the flexibility to maximize the independence between the seeds, which is a well-known criterion for maximizing sensitivity. Some other practical techniques to reduce memory usage are also discussed in the paper.
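As a generic illustration of what it means for a seed to "hit" a candidate homology, the sketch below indexes a target sequence with a spaced seed and looks up each query window. It is not the paper's two-stage extension or daughter-seed method; the function names seed_key and find_hits and the seed pattern 1101 are assumptions made only for this example.

```python
from collections import defaultdict

def seed_key(seq, pos, seed):
    """Characters of seq at the '1' (required match) positions of seed, starting at pos."""
    return "".join(seq[pos + i] for i, c in enumerate(seed) if c == "1")

def find_hits(query, target, seed):
    """Return sorted (query_pos, target_pos) pairs that agree at every '1' position of seed."""
    span = len(seed)
    # Index every window of the target by its seed key (hypothetical in-memory hash index).
    index = defaultdict(list)
    for t in range(len(target) - span + 1):
        index[seed_key(target, t, seed)].append(t)
    # Look up every window of the query; each match is a hit to examine further.
    hits = []
    for q in range(len(query) - span + 1):
        for t in index.get(seed_key(query, q, seed), []):
            hits.append((q, t))
    return sorted(hits)

query = "ACGTACGT"
target = "ACCTACGT"
spaced_hits = find_hits(query, target, "1101")      # weight 3, one "don't care" position
contiguous_hits = find_hits(query, target, "1111")  # weight 4, no "don't care" positions
print(spaced_hits)
print(contiguous_hits)
```

In this toy input the spaced seed tolerates the mismatch at the third position of the first window, so it reports the pair (0, 0) that the contiguous seed misses. The index holds one entry per target window, which hints at why memory becomes a concern as seed weight and the number of seeds grow.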

Related articles

Subset Seed Extension to Protein BLAST

Abstract: The seeding technique became central in the theory of sequence alignment and there are several efficient tools applying seeds to DNA homology search. Recently, a concept of subset seeds has been proposed for similarity search in protein sequences. We experimentally evaluate the applicability of subset seeds to protein homology search. We advocate the use of multiple subset seeds de...

Sensitivity analysis and efficient method for identifying optimal spaced seeds

The introduction of the spaced seed idea in the filtration stage of sequence comparison by Ma et al. (Bioinformatics 18 (2002) 440) has greatly increased the sensitivity of homology search without compromising the speed of search. Finding the optimal spaced seeds is of great importance both theoretically and in designing better search tools for sequence comparison. In this paper, we study the ...

Multiple spaced seeds for homology search

MOTIVATION: Homology search finds similar segments between two biological sequences, such as DNA or protein sequences. The introduction of optimal spaced seeds in PatternHunter has increased both the sensitivity and the speed of homology search, and it has been adopted by many alignment programs such as BLAST. With the further improvement provided by multiple spaced seeds in PatternHunterII, Smi...

Halopriming and Hydropriming Treatments to Overcome Salt and Drought Stress at Germination Stage of Corn (Zea mays L.)

To study the effects of halopriming and hydropriming in overcoming salt and drought stress in corn (Zea mays L.), two experiments were separately conducted at Shahrood University of Technology. Seed treatments consisted of control (untreated seeds), soaking in distilled water for 32 h (hydropriming), and soaking in 50 mmol solution of CaCl2 for 16 h (halopriming). Germination and early seedling gr...

Long spaced seeds for finding similarities between biological sequences

Homology search finds similar segments between two biological sequences, such as DNA or protein sequences. A significant fraction of the computing power in the world is devoted to finding similarities between biological sequences. The introduction of optimal spaced seeds in [Ma et al., Bioinformatics 18 (2002) 440–445] has increased both the sensitivity and the speed of homology search and it h...

Publication date: 2005