Spatial profiling of protein hydrophobicity: native vs. decoy structures.
Abstract
A recent study of 30 soluble globular protein structures revealed a quasi-invariant called the hydrophobic ratio. This ratio, defined as the distance at which the second-order hydrophobic moment vanishes divided by the distance at which the zero-order moment vanishes, was found to be 0.75 ± 0.05 for those structures. This report first describes the hydrophobic profiling of 5,387 non-redundant globular protein domains from the Protein Data Bank, which yields a hydrophobic ratio of 0.71 ± 0.08. A new hydrophobic score, based on this profiling, is then defined to discriminate native-like proteins from decoy structures, and it is tested on three widely used decoy sets: the Holm and Sander, Park and Levitt, and Baker decoys. Because hydrophobic moment profiling characterizes a global feature and requires reasonably good statistics, a protein must be large enough to yield relatively smooth moment profiles. We show that even under this size limitation (both the Park and Levitt and the Baker sets consist of small proteins), hydrophobic moment profiling and the hydrophobic score provide useful information that should complement force field calculations.
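The hydrophobic ratio described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not spell out: each residue is reduced to a distance from the protein centroid (plain spherical distance, which may differ from the geometry used in the paper), per-residue hydrophobicity values are shifted to zero mean so that the zero-order moment vanishes once all residues are enclosed, and the crossing point is found by linear interpolation at the last positive-to-non-positive sign change. The function names (`centered`, `moment_profile`, `zero_crossing`, `hydrophobic_ratio`) and the toy data are hypothetical and are not taken from the paper.

```python
def centered(values):
    """Shift values so they sum to zero (assumed normalization)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def moment_profile(dist, hydro, order):
    """Cumulative moment sum(h_i * r_i**order) over residues with r_i <= d.

    Returns (sorted_distances, cumulative_moments), one entry per residue.
    """
    radii, profile, total = [], [], 0.0
    for r, h in sorted(zip(dist, hydro)):
        total += h * r ** order
        radii.append(r)
        profile.append(total)
    return radii, profile


def zero_crossing(radii, profile, eps=1e-9):
    """Distance of the last positive-to-non-positive crossing, or None.

    The crossing is located by linear interpolation between the two
    neighbouring profile points; values within eps of zero count as zero.
    """
    crossing = None
    for i in range(1, len(profile)):
        prev, cur = profile[i - 1], profile[i]
        if abs(cur) < eps:
            cur = 0.0
        if prev > eps and cur <= 0.0:
            frac = prev / (prev - cur)
            crossing = radii[i - 1] + frac * (radii[i] - radii[i - 1])
    return crossing


def hydrophobic_ratio(dist, hydro):
    """Ratio d2 / d0 of the second-order and zero-order vanishing distances."""
    h = centered(hydro)
    d0 = zero_crossing(*moment_profile(dist, h, 0))
    d2 = zero_crossing(*moment_profile(dist, h, 2))
    if d0 is None or d2 is None:
        return None  # no usable crossing, e.g. a flat or noisy profile
    return d2 / d0


# Toy data (hypothetical): hydrophobic residues near the centre,
# hydrophilic residues near the surface.
toy_dist = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_hydro = [2.0, 2.0, 1.0, -1.0, -2.0, -2.0]
print(hydrophobic_ratio(toy_dist, toy_hydro))
```

With only six points the profile is coarse, which mirrors the abstract's remark that small proteins give less smooth moment profiles. A real calculation would need residue coordinates and a published hydrophobicity scale, neither of which is specified here.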
Related articles
Hydrophobic moments of protein structures: spatially profiling the distribution.
It is generally accepted that globular proteins fold with a hydrophobic core and a hydrophilic exterior. Might the spatial distribution of amino acid hydrophobicity exhibit common features? The hydrophobic profile detailing this distribution from the protein interior to exterior has been examined for 30 relatively diverse structures obtained from the Protein Data Bank, for 3 proteins of the 30S...
Protein structure quality assessment based on the distance profiles of consecutive backbone Cα atoms
Predicting the three dimensional native state structure of a protein from its primary sequence is an unsolved grand challenge in molecular biology. Two main computational approaches have evolved to obtain the structure from the protein sequence - ab initio/de novo methods and template-based modeling - both of which typically generate multiple possible native state structures. Model quality asse...
Using machine learning for decoy discrimination in protein tertiary structure prediction
In this thesis, the novelty of using machine learning to identify the low-RMSD structures in decoy discrimination in protein tertiary structure prediction is investigated. More specifically, neural networks are used to learn to recognize low-RMSD structures, using native protein structures as positive training examples, and simulated decoy structures as negative training examples. Simulated...
Decoy Database Improvement for Protein Folding
Predicting protein structures and simulating protein folding are two of the most important problems in computational biology today. Simulation methods rely on a scoring function to distinguish the native structure (the most energetically stable) from non-native structures. Decoy databases are collections of non-native structures used to test and verify these functions. We present a method to ev...
An improved protein decoy set for testing energy functions for protein structure prediction.
We have improved the original Rosetta centroid/backbone decoy set by increasing the number of proteins and frequency of near native models and by building on sidechains and minimizing clashes. The new set consists of 1,400 model structures for 78 different and diverse protein targets and provides a challenging set for the testing and evaluation of scoring functions. We evaluated the extent to w...
Journal: Proteins
Volume 52, Issue 4
Publication date: 2003