Predicting Protein Folding Classes without Overly Relying on Homology
Abstract
An important open problem in molecular biology is how to use computational methods to understand the structure and function of proteins given only their primary sequences. We describe and evaluate an original machine-learning approach to classifying protein sequences according to their structural folding class. Our work is novel in several respects: we use a set of protein classes that previously have not been used for classifying primary sequences, and we use a unique set of attributes to represent protein sequences to the learners. We evaluate our approach by measuring its ability to correctly classify proteins that were not in its training set. We compare our input representation to a commonly used input representation (amino acid composition) and show that our approach more accurately classifies proteins that have very limited homology to the sequences on which the systems are trained.
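For readers unfamiliar with the baseline representation named in the abstract, amino acid composition encodes a protein as the relative frequency of each of the 20 standard amino acids in its primary sequence. The following is a minimal sketch of that idea, not the authors' implementation; the function name `amino_acid_composition`, the handling of non-standard letters, and the example sequence are assumptions made for illustration.

```python
from collections import Counter
from typing import List

# One-letter codes of the 20 standard amino acids, in alphabetical order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> List[float]:
    """Return the fraction of each standard amino acid in `sequence`.

    Hypothetical helper for illustration only. Letters outside the
    20 standard codes (for example X or gaps) are ignored, which is
    an assumption, not something stated in the abstract.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        # No recognizable residues: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Example: a made-up four-residue sequence.
vector = amino_acid_composition("AACD")
print(len(vector))   # one entry per standard amino acid
print(vector[:3])    # fractions for A, C, D
```

Because the resulting vector ignores residue order, any two sequences with the same residue counts map to the same 20-dimensional point.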
Related articles
A Probabilistic Constraint-based Approach to Protein Structure Predication
Protein folding is the process by which a protein structure assumes its functional shape or conformation. It is believed that the dynamical folding of a protein to its native conformation is determined by the amino acid sequence of the protein. Given the usefulness of known protein structures in such valuable tasks as rational drug design, protein structure prediction is a highly active field o...
Structural Characteristics of Stable Folding Intermediates of Yeast Iso-1-Cytochrome-c
Cytochrome-c (cyt-c) is an electron transport protein that is present throughout evolution. More than 280 sequences have been reported in the protein sequence database (www.uniprot.org). Though sequentially diverse, cyt-c has essentially retained its tertiary structure or fold. Thus a vast data set of varied sequences with retention of similar structure and fun...
Mimicking the folding pathway to improve homology-free protein structure prediction.
Since the demonstration that the sequence of a protein encodes its structure, the prediction of structure from sequence remains an outstanding problem that impacts numerous scientific disciplines, including many genome projects. By iteratively fixing secondary structure assignments of residues during Monte Carlo simulations of folding, our coarse-grained model without information concerning hom...
Prediction of protein structural classes by a new measure of information discrepancy
Protein structural class describes the overall folding type of a protein or its domain. A number of methods were developed to predict protein structural class based on its primary sequence. The homology of the predicted sequences with respect to the training sequences is a key attribute for the prediction performance. In this article we investigated the FDOD method developed by Jin et al. [Jin,...
Protein Folding Prediction
The search for an accurate and efficient method for predicting protein conformation started around 1972 with a theoretical study on ribonuclease folding. Surprisingly, physicists, chemists, and mathematicians have made only very minute progress towards an empirical prediction algorithm for three-dimensional protein folding structures. The three most commonly accepted methods used in the p...
Journal title:
- Proceedings. International Conference on Intelligent Systems for Molecular Biology
Volume 3
Publication date: 1995