Evaluating the efficacy of a structure-derived amino acid substitution matrix in detecting protein homologs by BLAST and PSI-BLAST

Author

  • Nalin CW Goonesekere
Abstract

The large numbers of protein sequences generated by whole-genome sequencing projects require rapid and accurate methods of annotation. The detection of homology through computational sequence analysis is a powerful tool for determining the complex evolutionary and functional relationships that exist between proteins. Homology search algorithms employ amino acid substitution matrices to detect similarity between protein sequences. The substitution matrices in common use today are constructed from sequences aligned without reference to protein structure. Here we present amino acid substitution matrices constructed from the alignment of a large number of protein domain structures from the Structural Classification of Proteins (SCOP) database. We show that, when incorporated into the homology search algorithms BLAST and PSI-BLAST, the structure-based substitution matrices enhance the efficacy of detecting remote homologs.
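
To make the role of a substitution matrix concrete, here is a minimal Python sketch of the generic log-odds construction, in which each score is log2(observed pair frequency / frequency expected by chance), followed by the scoring of an ungapped alignment. It assumes that aligned residue pairs have already been extracted from some set of alignments. The abstract does not describe the paper's actual construction procedure, so this should not be read as the author's method. The function names `log_odds_matrix` and `alignment_score`, the `unseen` penalty, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def log_odds_matrix(aligned_pairs):
    """Build log-odds substitution scores (in bits) from aligned residue pairs.

    Generic log-odds construction; an illustration only, not the
    procedure used in the paper.
    """
    pair_counts = Counter()
    for a, b in aligned_pairs:
        # Count each pair in both orders so that score(a, b) == score(b, a).
        pair_counts[(a, b)] += 1
        pair_counts[(b, a)] += 1
    total = sum(pair_counts.values())

    # Background frequency of a residue = marginal of the pair distribution.
    residue_freq = Counter()
    for (a, _), n in pair_counts.items():
        residue_freq[a] += n / total

    scores = {}
    for (a, b), n in pair_counts.items():
        observed = n / total
        expected = residue_freq[a] * residue_freq[b]
        scores[(a, b)] = math.log2(observed / expected)
    return scores


def alignment_score(seq_a, seq_b, scores, unseen=-4.0):
    """Sum substitution scores over an ungapped alignment.

    `unseen` is an assumed penalty for residue pairs absent from the
    training data.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("ungapped alignment requires equal-length sequences")
    return sum(scores.get((x, y), unseen) for x, y in zip(seq_a, seq_b))


# Toy data: 4 A-A pairs, 4 L-L pairs and 2 A-L pairs (made up for illustration).
toy_pairs = [("A", "A")] * 4 + [("L", "L")] * 4 + [("A", "L")] * 2
toy_scores = log_odds_matrix(toy_pairs)
toy_total = alignment_score("AAL", "ALL", toy_scores)
```

In this toy set, identities occur more often than chance predicts and therefore score positively, while the A-L mismatch scores negatively. Search tools such as BLAST build on the same idea, adding gap penalties and score statistics on top of the per-pair scores.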


Similar resources

Weak Homology Detection by Profile-Profile Comparison

Experimentally determined protein tertiary structures are currently increasing at an enormous rate. The growth of the structure database makes structure prediction methods based on known tertiary structures more efficient. Detecting weak homology is crucial to extending the applicable area of these methods. For this purpose, a profile or PSSM (position-specific score matrix) derived from multiple alig...


Discriminative modelling of context-specific amino acid substitution probabilities

MOTIVATION: Protein sequence searching and alignment are fundamental tools of modern biology. Alignments are assessed using their similarity scores, essentially the sum of substitution matrix scores over all pairs of aligned amino acids. We previously proposed a generative probabilistic method that yields scores that take the sequence context around each aligned residue into account. This method...


A Boost for Sequence Searching Extension to BLAST Can Improve its Sensitivity Twofold

descended from a common ancestor, usually not only have similar sequences but also similar structures and functions. Hence, when two sequences are similar to a degree that cannot be explained by chance, we can assume that the similarity in sequence arose by common descent and that the proteins are therefore likely to be structurally and functionally similar. This principle of “homology-based in...


PISCES: a protein sequence culling server

PISCES is a public server for culling sets of protein sequences from the Protein Data Bank (PDB) by sequence identity and structural quality criteria. PISCES can provide lists culled from the entire PDB or from lists of PDB entries or chains provided by the user. The sequence identities are obtained from PSI-BLAST alignments with position-specific substitution matrices derived from the non-redu...


Alternative approach to protein structure prediction based on sequential similarity of physical properties.

The relationship between protein sequence and structure arises entirely from amino acid physical properties. An alternative method is therefore proposed to identify homologs in which residue equivalence is based exclusively on the pairwise physical property similarities of sequences. This approach, the property factor method (PFM), is entirely different from those in current use. A comparison i...



Journal title:

Volume 2  Issue 

Pages  -

Publication date 2009