Using Bagging classifier to predict protein domain structural class.

نویسندگان

  • Liuhuan Dong
  • Yuan Yuan
  • Yudong Cai
چکیده

Classification and prediction of protein domain structural class is one of the important topics in the molecular biology. We introduce the Bagging (Bootstrap aggregating), one of the bootstrap methods, for classifying and predicting protein structural classes. By a bootstrap aggregating procedure, the Bagging can improve a weak classifier, for instance the random tree method, to a significant step towards optimality. In this research, it is demonstrated that the Bagging performed at least as well as LogitBoost and Support vector machines in predicting the structural classes for a given protein domain dataset by 10 cross-validation test, which indicate that the Bagging method is promising and anticipated that it could be potentially further improved on predicting protein structural classes as well as other bio-macromolecular attributes, if the bagging method and other existing methods can be effectively complemented with each other.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Application of ensemble learning techniques to model the atmospheric concentration of SO2

In view of pollution prediction modeling, the study adopts homogenous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of Sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...

متن کامل

Boosting classifier for predicting protein domain structural class.

A novel classifier, the so-called "LogitBoost" classifier, was introduced to predict the structural class of a protein domain according to its amino acid sequence. LogitBoost is featured by introducing a log-likelihood loss function to reduce the sensitivity to noise and outliers, as well as by performing classification via combining many weak classifiers together to build up a very strong and ...

متن کامل

ارتقای کیفیت دسته‌بندی متون با استفاده از کمیته‌ دسته‌بند دو سطحی

Nowadays, the automated text classification has witnessed special importance due to the increasing availability of documents in digital form and ensuing need to organize them. Although this problem is in the Information Retrieval (IR) field, the dominant approach is based on machine learning techniques. Approaches based on classifier committees have shown a better performance than the others. I...

متن کامل

Improving reservoir rock classification in heterogeneous carbonates using boosting and bagging strategies: A case study of early Triassic carbonates of coastal Fars, south Iran

An accurate reservoir characterization is a crucial task for the development of quantitative geological models and reservoir simulation. In the present research work, a novel view is presented on the reservoir characterization using the advantages of thin section image analysis and intelligent classification algorithms. The proposed methodology comprises three main steps. First, four classes of...

متن کامل

An Investigation of Sensitivity on Bagging Predictors: An Empirical Approach

As growing numbers of real world applications involve imbalanced class distribution or unequal costs for misclassification errors in different classes, learning from imbalanced class distribution is considered to be one of the most challenging issues in data mining research. This study empirically investigates the sensitivity of bagging predictors with respect to 12 algorithms and 9 levels of c...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Journal of biomolecular structure & dynamics

دوره 24 3  شماره 

صفحات  -

تاریخ انتشار 2006