Using sampling methods to improve binding site predictions
Abstract
Currently the best algorithms for transcription factor binding site prediction are severely limited in accuracy. In previous work we combined random-selection under-sampling with the SMOTE over-sampling technique, working with several classification algorithms from the machine learning field to integrate binding site predictions. In this paper, we improve the classification results with the aid of Tomek links, used as either an under-sampling or a cleaning technique.
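The two uses of Tomek links mentioned in the abstract can be illustrated with a small sketch. A Tomek link is a pair of examples from different classes that are each other's nearest neighbour. Used for under-sampling, only the majority-class member of each link is dropped; used for cleaning, both members are dropped. The code below is not the authors' implementation: the function names, the Euclidean distance, the toy data and the label convention (1 = binding site, 0 = non-binding) are all assumptions made for illustration.

```python
import math


def nearest_neighbor(i, points):
    """Index of the point closest to points[i] (Euclidean), excluding i itself."""
    return min(
        (j for j in range(len(points)) if j != i),
        key=lambda j: math.dist(points[i], points[j]),
    )


def tomek_links(points, labels):
    """Return pairs (i, j) with i < j that are mutual nearest neighbours
    and carry different class labels."""
    nn = [nearest_neighbor(i, points) for i in range(len(points))]
    return [
        (i, j)
        for i, j in enumerate(nn)
        if i < j and nn[j] == i and labels[i] != labels[j]
    ]


def remove_tomek_links(points, labels, majority_label, mode="undersample"):
    """Drop examples that take part in Tomek links.

    mode="undersample": drop only the majority-class member of each link.
    mode="clean":       drop both members of each link.
    """
    drop = set()
    for i, j in tomek_links(points, labels):
        for k in (i, j):
            if mode == "clean" or labels[k] == majority_label:
                drop.add(k)
    keep = [k for k in range(len(points)) if k not in drop]
    return [points[k] for k in keep], [labels[k] for k in keep]


# Hypothetical toy data: label 1 = binding site (minority), 0 = non-binding.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.05, 1.0), (3.0, 3.0), (3.2, 3.0)]
labels = [0, 0, 1, 0, 0, 0]

links = tomek_links(points, labels)
under_pts, under_lbls = remove_tomek_links(points, labels, majority_label=0)
clean_pts, clean_lbls = remove_tomek_links(points, labels, majority_label=0, mode="clean")
```

In this toy set the only link is the pair at indices 2 and 3, so under-sampling keeps the minority example and removes its majority neighbour, while cleaning removes both. The brute-force neighbour search is quadratic in the number of examples, which is acceptable for a sketch but would need an index structure for realistic data sizes.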
Similar resources
Using pre & post-processing methods to improve binding site predictions
Currently the best algorithms for transcription factor binding site prediction within sequences of regulatory DNA are severely limited in accuracy. In this paper, we integrate 12 original binding site prediction algorithms, and use a ‘window’ of consecutive predictions in order to contextualise the neighbouring results. We combine either random selection or Tomek links under-sampling with SMOTE...
Integrating binding site predictions using meta classification methods
Currently the best algorithms for transcription factor binding site prediction are severely limited in accuracy. There is good reason to believe that predictions from these different classes of algorithms could be used in conjunction to improve the quality of predictions. In this paper, we apply single layer networks and support vector machines on predictions from key algorithms. Furthermore, w...
Using Real-Valued Meta Classifiers to Integrate and Contextualize Binding Site Predictions
Currently the best algorithms for transcription factor binding site predictions are severely limited in accuracy. However, a non-linear combination of these algorithms could improve the quality of predictions. A support-vector machine was applied to combine the predictions of 12 key real valued algorithms. The data was divided into a training set and a test set, of which two were constructed: f...
GalaxySite: ligand-binding-site prediction by using molecular docking
Knowledge of ligand-binding sites of proteins provides invaluable information for functional studies, drug design and protein design. Recent progress in ligand-binding-site prediction methods has demonstrated that using information from similar proteins of known structures can improve predictions. The GalaxySite web server, freely accessible at http://galaxy.seoklab.org/site, combines such info...
Effect of Using Varying Negative Examples in Transcription Factor Binding Site Predictions
Background: Identifying transcription factor binding sites (TFBSs) computationally is a hard problem as it produces many false predictions. Combining the predictions from existing predictors can improve the overall predictions by using classification methods like Support Vector Machines (SVMs). But conventional negative examples (that is, example which is the part of non-binding sites) in this ...
Publication date: 2006