Protein function annotation from sequence: prediction of residues interacting with RNA
نویسندگان
چکیده
MOTIVATION All eukaryotic proteomes are characterized by a significant percentage of proteins of unknown function. Comp-utational function prediction methods are therefore essential as initial steps in the function annotation process. This article describes an annotation method (PiRaNhA) for the prediction of RNA-binding residues (RBRs) from protein sequence information. A series of sequence properties (position specific scoring matrices, interface propensities, predicted accessibility and hydrophobicity) are used to train a support vector machine. This method is then evaluated for its potential to be applied to RNA-binding function prediction at the level of the complete protein. RESULTS The 5-fold cross-validation of PiRaNhA on a dataset of 81 RNA-binding proteins achieves a Matthews Correlation Coefficient (MCC) of 0.50 and accuracy of 87.2%. When used to predict RBRs in 42 proteins not used in training, PiRaNhA achieves an MCC of 0.41 and accuracy of 84.5%. Decision values from the PiRaNhA predictions were used in a second SVM to make predictions of RNA-binding function at the protein level, achieving an MCC of 0.53 and accuracy of 76.1%. The PiRaNhA RBR predictions allow experimentalists to perform more targeted experiments for function annotation; and the prediction of RNA-binding function at the protein level shows promise for proteome-wide annotations. AVAILABILITY AND IMPLEMENTATION Freely available on the web at www.bioinformatics.sussex.ac.uk/PIRANHA or http://piranha.protein.osaka-u.ac.jp. SUPPLEMENTARY INFORMATION Supplementary data are available at the Bioinformatics online.
منابع مشابه
Prediction of protein-RNA binding sites by a random forest method with combined features
MOTIVATION Protein-RNA interactions play a key role in a number of biological processes, such as protein synthesis, mRNA processing, mRNA assembly, ribosome function and eukaryotic spliceosomes. As a result, a reliable identification of RNA binding site of a protein is important for functional annotation and site-directed mutagenesis. Accumulated data of experimental protein-RNA interactions re...
متن کاملConFunc - functional annotation in the twilight zone
MOTIVATION The success of genome sequencing has resulted in many protein sequences without functional annotation. We present ConFunc, an automated Gene Ontology (GO)-based protein function prediction approach, which uses conserved residues to generate sequence profiles to infer function. ConFunc split sets of sequences identified by PSI-BLAST into sub-alignments according to their GO annotation...
متن کاملCharacterization of cDNA sequence encoding for a novel sodium channel -toxin from the Iranian scorpion Mesobuthus eupeus venom glands
The venoms of Buthidae scorpions are known to contain basic, single-chain protein -toxins consisting of 60-70 amino acid residues that are tightly cross-linked by four disulfide bridges. Total RNA was extracted from the venom glands of scorpion Mesobuthus eupeus collected from the Khuzestan province of Iran and then cDNA was synthesized with the modified oligo (dT) primer and extracted total R...
متن کاملEvaluation of features for catalytic residue prediction in novel folds.
Structural genomics projects are determining the three-dimensional structure of proteins without full characterization of their function. A critical part of the annotation process involves appropriate knowledge representation and prediction of functionally important residue environments. We have developed a method to extract features from sequence, sequence alignments, three-dimensional structu...
متن کاملAutomated structure-based prediction of functional sites in proteins: applications to assessing the validity of inheriting protein function from homology in genome annotation and to protein docking.
A major problem in genome annotation is whether it is valid to transfer the function from a characterised protein to a homologue of unknown activity. Here, we show that one can employ a strategy that uses a structure-based prediction of protein functional sites to assess the reliability of functional inheritance. We have automated and benchmarked a method based on the evolutionary trace approac...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Bioinformatics
دوره 25 12 شماره
صفحات -
تاریخ انتشار 2009