YASS: Similarity search in DNA sequences

Authors

  • Laurent Noé
  • Gregory Kucherov
Abstract

We describe YASS, a new tool for finding local similarities in DNA sequences. The YASS algorithm first scans the sequence(s) and creates groups of seeds (small exact repeats obtained by hashing) on the fly, according to statistically founded criteria. It then tries to extend those groups into similarity regions on the basis of a new extension criterion. The method can be seen as a compromise between single-seed (BLAST) and multiple-seed (FASTA, BLAT) approaches, and achieves a gain in both sensitivity and selectivity. The method is flexible and can be made more efficient by using spaced seeds, in particular transition-constrained spaced seeds. We provide examples of applying YASS to Saccharomyces cerevisiae and Drosophila melanogaster chromosomes.

Key-words: YASS, local alignment, spaced seeds, transitions
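The seed-hashing and grouping steps mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the YASS implementation. The function names, the same-diagonal grouping rule, and the fixed `max_gap` threshold are assumptions made for illustration; the abstract says only that YASS groups seeds according to statistically founded criteria. The pattern notation (`#` exact match, `@` match or transition, `-` ignored position) is also an assumption of this sketch, and sequences are assumed to be uppercase A, C, G, T.

```python
from collections import defaultdict

# A transition substitutes a purine for a purine (A<->G)
# or a pyrimidine for a pyrimidine (C<->T).
PURINES = {"A", "G"}


def seed_key(window, pattern):
    """Return the hash key of `window` under a seed `pattern`.

    Pattern symbols (notation assumed for this sketch):
      '#'  the base must match exactly
      '@'  the base must match or differ by a transition
      '-'  the position is ignored
    """
    key = []
    for base, sym in zip(window, pattern):
        if sym == "#":
            key.append(base)
        elif sym == "@":
            # Collapse the base to its class: R (purine) or Y (pyrimidine).
            key.append("R" if base in PURINES else "Y")
    return "".join(key)


def find_seed_hits(s, t, pattern):
    """Return all (i, j) such that s[i:] and t[j:] agree under `pattern`."""
    span = len(pattern)
    index = defaultdict(list)
    for i in range(len(s) - span + 1):
        index[seed_key(s[i:i + span], pattern)].append(i)
    hits = []
    for j in range(len(t) - span + 1):
        for i in index.get(seed_key(t[j:j + span], pattern), ()):
            hits.append((i, j))
    return hits


def group_by_diagonal(hits, max_gap):
    """Group hits lying on the same diagonal (j - i) whose start
    positions in the first sequence are at most `max_gap` apart.

    This fixed-gap rule is a simplification, not the YASS criterion.
    """
    by_diag = defaultdict(list)
    for i, j in hits:
        by_diag[j - i].append(i)
    groups = []
    for diag, starts in by_diag.items():
        starts.sort()
        current = [starts[0]]
        for pos in starts[1:]:
            if pos - current[-1] <= max_gap:
                current.append(pos)
            else:
                groups.append((diag, current))
                current = [pos]
        groups.append((diag, current))
    return groups
```

With the pattern `###` the sketch behaves like a contiguous exact seed of length 3. Replacing a `#` with `@` also lets the seed hit where the two sequences differ by a transition at that position, and `-` ignores the position entirely, which is the idea behind spaced seeds.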

Similar resources

YASS: enhancing the sensitivity of DNA similarity search

YASS is a DNA local alignment tool based on an efficient and sensitive filtering algorithm. It applies transition-constrained seeds to specify the most probable conserved motifs between homologous sequences, combined with a flexible hit criterion used to identify groups of seeds that are likely to exhibit significant alignments. A web interface (http://www.loria.fr/projects/YASS/) is available ...

Development of an Efficient Hybrid Method for Motif Discovery in DNA Sequences

This work presents a hybrid method for motif discovery in DNA sequences. The proposed method, called SPSO-Lk, borrows the concept of Chebyshev polynomials and uses stochastic local search to improve the performance of the basic PSO algorithm as a motif finder. The Chebyshev polynomial concept encourages us to use a linear combination of previously discovered velocities beyond that proposed b...

Indexing DNA Sequences Using q-Grams

We have observed in recent years a growing interest in similarity search on large collections of biological sequences. Contributing to the interest, this paper presents a method for indexing the DNA sequences efficiently based on q-grams to facilitate similarity search in a DNA database and sidestep the need for linear scan of the entire database. Two level index – hash table and c-trees – are ...

Estimating the Redundancy Factor for RA-encoded sequences and also Studying Steganalysis Performance of YASS

Our recently introduced JPEG steganographic method called Yet Another Steganographic Scheme (YASS) can resist blind steganalysis by embedding data in the discrete cosine transform (DCT) domain in randomly chosen image blocks. To maximize the embedding rate for a given image and a specified attack channel, the redundancy factor used by the repeat-accumulate (RA) code based error correction frame...

The Investigation of Mutations and Comparison of Leptin Gene Pro-Motor in Najdi Cattle with the Database NCBI Sequences

Objective: Identifying the genetic aspects and major genes that influence energy balance, milk production, fertility, food safety and the consumer is a recent interest of genetics and breeding researchers. Methods: Najdi cattle are the most prominent breed in Khuzestan province. For this study, blood samples were taken from 15 Najdi cattle at the Shoushtar Najdi Cattle Station. DNA was extracted from wh...

Publication date: 2003