Predicting Protein Folding Classes without Overly Relying on Homology

Authors

  • Mark Craven
  • Richard J. Mural
  • Loren J. Hauser
  • Edward C. Uberbacher
Abstract

An important open problem in molecular biology is how to use computational methods to understand the structure and function of proteins given only their primary sequences. We describe and evaluate an original machine-learning approach to classifying protein sequences according to their structural folding class. Our work is novel in several respects: we use a set of protein classes that previously have not been used for classifying primary sequences, and we use a unique set of attributes to represent protein sequences to the learners. We evaluate our approach by measuring its ability to correctly classify proteins that were not in its training set. We compare our input representation to a commonly used input representation--amino acid composition--and show that our approach more accurately classifies proteins that have very limited homology to the sequences on which the systems are trained.
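
The baseline representation named in the abstract, amino acid composition, is conventionally a 20-element vector giving the fraction of each standard residue in a sequence. The sketch below illustrates only that baseline, not the authors' own attribute set, which the abstract does not describe. The function name, the example sequence, and the choice to ignore non-standard residue codes are assumptions made for illustration.

```python
from collections import Counter

# The 20 standard amino acids as one-letter codes (alphabetical order).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return the fraction of each standard residue in `sequence`.

    Assumption: characters outside the 20 standard codes (e.g. X, B, Z)
    are ignored rather than counted toward the total.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical sequence, used only to show the call.
features = amino_acid_composition("MKVLAAGIVK")
```

A vector like this discards residue order entirely, which is one reason a richer attribute set could plausibly do better on sequences with little homology to the training data.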

Similar resources

A Probabilistic Constraint-based Approach to Protein Structure Predication

Protein folding is the process by which a protein structure assumes its functional shape or conformation. It is believed that the dynamical folding of a protein to its native conformation is determined by the amino acid sequence of the protein. Given the usefulness of known protein structures in such valuable tasks as rational drug design, protein structure prediction is a highly active field o...

Structural Characteristics of Stable Folding Intermediates of Yeast Iso-1-Cytochrome-c

Cytochrome-c (cyt-c) is an electron transport protein, and it is present throughout the evolution. More than 280 sequences have been reported in the protein sequence database (www.uniprot.org). Though sequentially diverse, cyt-c has essentially retained its tertiary structure or fold. Thus a vast data set of varied sequences with retention of similar structure and fun...

Mimicking the folding pathway to improve homology-free protein structure prediction.

Since the demonstration that the sequence of a protein encodes its structure, the prediction of structure from sequence remains an outstanding problem that impacts numerous scientific disciplines, including many genome projects. By iteratively fixing secondary structure assignments of residues during Monte Carlo simulations of folding, our coarse-grained model without information concerning hom...

Prediction of protein structural classes by a new measure of information discrepancy

Protein structural class describes the overall folding type of a protein or its domain. A number of methods were developed to predict protein structural class based on its primary sequence. The homology of the predicted sequences with respect to the training sequences is a key attribute for the prediction performance. In this article we investigated the FDOD method developed by Jin et al. [Jin,...

Protein Folding Prediction

The search for an accurate and efficient protein conformation predicting method started around 1972 with a theoretical study on the ribonuclease folding. Surprisingly, physicist, chemists, and mathematicians have only been making very minute progress towards the empirical prediction algorithm of three-dimensional protein folding structures. The three most commonly accepted methods used in the p...

Journal:
  • Proceedings. International Conference on Intelligent Systems for Molecular Biology

Volume 3

Publication date: 1995