Predicting membrane protein type by functional domain composition and pseudo-amino acid composition.

Authors

  • Yu-Dong Cai
  • Kuo-Chen Chou
Abstract

Given the sequence of a protein, how can we predict whether it is a membrane protein or a non-membrane protein? If it is a membrane protein, to which membrane protein type does it belong? Because these questions bear directly on the function of an uncharacterized protein, their importance is self-evident. In particular, with the explosion of protein sequences entering databanks and the much slower progress of biochemical experiments in determining their functions, an automated method that answers these questions quickly is highly desirable. By hybridizing the functional domain (FunD) composition and the pseudo-amino acid (PseAA) composition, a new strategy called the FunD-PseAA predictor was introduced. To test the power of the predictor, a highly non-homologous data set was constructed in which no protein has 25% or higher sequence identity to any other. The overall success rates obtained with the FunD-PseAA predictor on this data set by the jackknife cross-validation test were 85% for distinguishing membrane proteins from non-membrane proteins, and 91% for identifying the membrane protein type among the following 5 categories: (1) type-1 membrane protein, (2) type-2 membrane protein, (3) multipass transmembrane protein, (4) lipid chain-anchored membrane protein, and (5) GPI-anchored membrane protein. These rates are much higher than those obtained by other methods on the same stringent data set, indicating that the FunD-PseAA predictor may become a useful high-throughput tool in bioinformatics and proteomics.
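The jackknife cross-validation test mentioned in the abstract can be illustrated with a minimal sketch: each protein is removed from the data set in turn, the remaining proteins serve as the training set, and the held-out protein is then classified. The abstract does not describe the FunD-PseAA feature vector or the classifier in detail, so the sketch below substitutes a plain 20-component amino acid composition and a nearest-neighbor rule. Both are stand-in assumptions, and the function names (`aa_composition`, `nearest_label`, `jackknife_success_rate`) are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 native amino acids


def aa_composition(seq):
    """Return the 20-component amino acid composition of a non-empty sequence.

    Assumption: this plain composition stands in for the FunD-PseAA
    features, which the abstract does not specify in detail.
    """
    counts = Counter(seq)
    return [counts.get(a, 0) / len(seq) for a in AMINO_ACIDS]


def nearest_label(query, samples):
    """Return the label of the training sample closest to `query`.

    Assumption: a squared-Euclidean nearest-neighbor rule, used only as
    a placeholder classifier.
    """
    def dist(sample):
        return sum((x - y) ** 2 for x, y in zip(query, sample[0]))

    return min(samples, key=dist)[1]


def jackknife_success_rate(dataset):
    """Leave-one-out test: hold out each protein once, train on the rest."""
    hits = 0
    for i, (seq, label) in enumerate(dataset):
        training = [
            (aa_composition(s), lab)
            for j, (s, lab) in enumerate(dataset)
            if j != i
        ]
        if nearest_label(aa_composition(seq), training) == label:
            hits += 1
    return hits / len(dataset)


# Toy data, invented purely for illustration (not real proteins).
toy = [
    ("AAAAL", "membrane"),
    ("AAALL", "membrane"),
    ("KKKKE", "non-membrane"),
    ("KKKEE", "non-membrane"),
]
rate = jackknife_success_rate(toy)
```

Because every sample is held out exactly once, the jackknife rate for a given data set and predictor is deterministic, unlike a random train/test split.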

Similar resources

Support vector machines for predicting membrane protein types by using functional domain composition.

Membrane proteins are generally classified into the following five types: 1), type I membrane protein; 2), type II membrane protein; 3), multipass transmembrane proteins; 4), lipid chain-anchored membrane proteins; and 5), GPI-anchored membrane proteins. In this article, based on the concept of using the functional domain composition to define a protein, the Support Vector Machine algorithm is ...

Using grey dynamic modeling and pseudo amino acid composition to predict protein structural classes

Using the pseudo amino acid (PseAA) composition to represent the sample of a protein can incorporate a considerable amount of sequence pattern information so as to improve the prediction quality for its structural or functional classification. However, how to optimally formulate the PseAA composition is an important problem yet to be solved. In this article the grey modeling approach is introdu...

Comparison the functional properties of protein Hydrolysates from poultry byproducts and rainbow trout

Poultry by-products and rainbow trout (Oncorhynchus mykiss) viscera are abundant and underutilized resources that can be used as a unique protein source to make protein hydrolysates. In this study, protein hydrolysates were made from these two different sources with Alcalase 2.4L. The functional properties of fish viscera protein hydrolysate (FPH) compared to poultry by-products protein hydrolys...

Predicting Protein Functional Class with the Weighted Segmented Pseudo-Amino Acid Composition Moment Vector

Predicting protein function at the proteomic-scale is one of the fundamental goals in cell biology and proteomics. In this paper, we proposed a new method for characterizing protein sequences—the Weighted Segmented Pseudo-amino acid composition Moment Vector (W-SPsAA-MV). From protein sequences, the encoding method of W-SPsAA-MV is applied to protein functional class prediction associated with ...

PseAAC-General: Fast Building Various Modes of General Form of Chou’s Pseudo-Amino Acid Composition for Large-Scale Protein Datasets

The general form pseudo-amino acid composition (PseAAC) has been widely used to represent protein sequences in predicting protein structural and functional attributes. We developed the program PseAAC-General to generate various different modes of Chou's general PseAAC, such as the gene ontology mode, the functional domain mode, and the sequential evolution mode. This program allows the users to...

Journal:
  • Journal of Theoretical Biology

Volume 238, Issue 2

Pages: -

Publication date: 2006