Rapid Homology Search with Two-Stage Extension and Daughter Seeds
Abstract
Using a seed to rapidly “hit” possible homologies for further examination is a common practice to speed up homology search in molecular sequences. It has been shown that a collection of higher weight seeds has better sensitivity than a single lower weight seed at the same speed. However, huge memory requirements diminish the advantages of high weight seeds. This paper describes a two-stage extension method, which simulates high weight seeds with modest memory requirements. The paper also proposes the use of so-called daughter seeds, an extension of the previously studied vector seed idea. Daughter seeds, especially when combined with the two-stage extension, provide the flexibility to maximize the independence between the seeds, which is a well-known criterion for maximizing sensitivity. Some other practical techniques to reduce memory usage are also discussed in the paper.
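To make the idea of a seed "hit" concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes a spaced seed is written as a string of `1` (must match) and `0` (don't care) positions. It also assumes, purely for illustration, that a two-stage check can be modelled as an index lookup with a lower weight seed followed by a test of a few extra offsets, so that the index only needs to store the lower weight keys. The names `seed_key`, `build_index`, `two_stage_hits`, and the `extra` parameter are hypothetical.

```python
from collections import defaultdict


def seed_key(seq, pos, seed):
    """Letters of seq under the '1' positions of seed, starting at pos."""
    return "".join(seq[pos + i] for i, c in enumerate(seed) if c == "1")


def build_index(target, seed):
    """Map each seed key to the target positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(target) - len(seed) + 1):
        index[seed_key(target, pos, seed)].append(pos)
    return index


def two_stage_hits(query, target, seed, extra=()):
    """Return (query_pos, target_pos) pairs that pass both stages.

    Stage 1: look up the low weight seed key in the index.
    Stage 2 (illustrative assumption): also require matches at the
    offsets in `extra`, measured from the start of the hit.
    """
    index = build_index(target, seed)
    hits = []
    for q in range(len(query) - len(seed) + 1):
        for t in index.get(seed_key(query, q, seed), []):
            if all(
                0 <= q + d < len(query)
                and 0 <= t + d < len(target)
                and query[q + d] == target[t + d]
                for d in extra
            ):
                hits.append((q, t))
    return hits


if __name__ == "__main__":
    # Weight-3 seed "1101" plus two extra offsets checked in stage 2.
    print(two_stage_hits("ACGTAC", "GGACTTACGG", "1101", extra=(4, 5)))
```

In this sketch the index is keyed only by the weight-3 seed, while the extra offsets raise the effective number of required matches without enlarging the index. That is the memory trade-off the abstract alludes to, shown here under the stated assumptions only.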
Related resources
Subset Seed Extension to Protein BLAST
Abstract: The seeding technique became central in the theory of sequence alignment and there are several efficient tools applying seeds to DNA homology search. Recently, a concept of subset seeds has been proposed for similarity search in protein sequences. We experimentally evaluate the applicability of subset seeds to protein homology search. We advocate the use of multiple subset seeds de...
Sensitivity analysis and efficient method for identifying optimal spaced seeds
The introduction of the spaced seed idea in the filtration stage of sequence comparison by Ma et al. (Bioinformatics 18 (2002) 440) has greatly increased the sensitivity of homology search without compromising the speed of search. Finding the optimal spaced seeds is of great importance both theoretically and in designing better search tools for sequence comparison. In this paper, we study the ...
Multiple spaced seeds for homology search
MOTIVATION: Homology search finds similar segments between two biological sequences, such as DNA or protein sequences. The introduction of optimal spaced seeds in PatternHunter has increased both the sensitivity and the speed of homology search, and it has been adopted by many alignment programs such as BLAST. With the further improvement provided by multiple spaced seeds in PatternHunterII, Smi...
Halopriming and Hydropriming Treatments to Overcome Salt and Drought Stress at Germination Stage of Corn (Zea mays L.)
To study the effects of halopriming and hydropriming in overcoming salt and drought stress in corn (Zea mays L.), two experiments were separately conducted at Shahrood University of Technology. Seed treatments consisted of control (untreated seeds), soaking in distilled water for 32 h (hydropriming), and soaking in 50 mmol solution of CaCl2 for 16 h (halopriming). Germination and early seedling gr...
Long spaced seeds for finding similarities between biological sequences
Homology search finds similar segments between two biological sequences, such as DNA or protein sequences. A significant fraction of the computing power in the world is devoted to finding similarities between biological sequences. The introduction of optimal spaced seeds in [Ma et al., Bioinformatics 18 (2002) 440–445] has increased both the sensitivity and the speed of homology search and it h...