SVM Kernel Optimization: An Example in Yeast Protein Subcellular Localization Prediction
Abstract
Protein localization, a flourishing area of bioinformatics, can help us understand protein function. A number of localization approaches based on machine learning algorithms currently exist, and support vector machines (SVMs) have been used extensively. However, there is so far no well-established systematic method for kernel optimization, a critical step in SVM design. In this paper, we apply the Levenberg-Marquardt (LM) algorithm to kernel optimization and test it on protein localization data from yeast. We expect LM to optimize better than simple gradient descent, and to our knowledge it has not previously been used in this field. We then experimentally compare the performance of our optimized system with that of other systems. The results show that automated parameter optimization can improve classification accuracy. Further research could complement this method by controlling over-fitting.
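To make the idea of LM-based kernel tuning concrete, here is a minimal sketch. The abstract does not state which objective the authors optimize, so this example assumes a stand-in: the squared error between an RBF kernel and an "ideal" kernel that is 1 for same-class pairs and 0 otherwise. The function names, the toy data, and the single tuned parameter (the logarithm of the RBF width `gamma`) are all hypothetical and are not taken from the paper.

```python
import math

# Toy two-class data (hypothetical, for illustration only).
POINTS = [(0.0, 0.0), (0.3, 0.1), (0.1, 0.4), (0.4, 0.3),
          (2.0, 2.0), (2.3, 1.9), (1.8, 2.2), (2.2, 2.4)]
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]


def rbf_residuals(log_gamma, points, labels):
    """Return residuals K_ij - T_ij and their derivatives w.r.t. log_gamma.

    T_ij is an assumed 'ideal kernel': 1 for same-class pairs, 0 otherwise.
    """
    gamma = math.exp(log_gamma)
    residuals, jacobian = [], []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            k = math.exp(-gamma * sq_dist)
            target = 1.0 if labels[i] == labels[j] else 0.0
            residuals.append(k - target)
            # d/d(log_gamma) of exp(-gamma * d2) is -gamma * d2 * k.
            jacobian.append(-gamma * sq_dist * k)
    return residuals, jacobian


def sum_sq(values):
    return sum(v * v for v in values)


def levenberg_marquardt(points, labels, log_gamma=0.0, damping=1e-2,
                        max_iter=50, tol=1e-10):
    """Tune a single RBF width with a damped Gauss-Newton (LM) loop."""
    residuals, jacobian = rbf_residuals(log_gamma, points, labels)
    loss = sum_sq(residuals)
    history = [loss]  # losses of the start point and every accepted step
    for _ in range(max_iter):
        jtj = sum_sq(jacobian)
        jtr = sum(j * r for j, r in zip(jacobian, residuals))
        # Levenberg step: large damping behaves like a short gradient step,
        # small damping behaves like a Gauss-Newton step.
        step = -jtr / (jtj + damping)
        trial = log_gamma + step
        trial_res, trial_jac = rbf_residuals(trial, points, labels)
        trial_loss = sum_sq(trial_res)
        if trial_loss < loss:
            improvement = loss - trial_loss
            log_gamma, residuals, jacobian = trial, trial_res, trial_jac
            loss = trial_loss
            history.append(loss)
            damping /= 10.0  # trust the quadratic model more
            if improvement < tol:
                break
        else:
            damping *= 10.0  # reject the step and shorten the next one
    return math.exp(log_gamma), history


gamma, history = levenberg_marquardt(POINTS, LABELS)
print(f"tuned gamma = {gamma:.4f}, loss {history[0]:.4f} -> {history[-1]:.4f}")
```

The damping term is what distinguishes this from plain gradient descent: it is adjusted after every trial step, so the update moves between cautious gradient-like steps and faster Gauss-Newton steps. A real SVM setting would likely tune several parameters against a validation-based objective, which this sketch does not attempt.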
Similar resources
NetLoc: Network Based Protein Localization Prediction Using Protein-Protein Interaction, Genetic Interaction, and Co-expression Networks
Recent study shows that protein-protein interaction network based features can significantly improve the prediction of protein subcellular localization. However, it is unclear whether network prediction models or other types of protein-protein correlation networks would also improve localization prediction. We present NetLoc, a novel network based algorithm for predicting protein subcellular lo...
Network based prediction of protein localisation using diffusion Kernel
We present NetLoc, a novel diffusion Kernel-based Logistic Regression (KLR) algorithm for predicting protein subcellular localisation using four types of protein networks including physical PPI networks, genetic Protein-Protein Interaction (PPI) networks, mixed PPI networks and co-expression networks. NetLoc is applied to yeast protein localisation prediction. The results showed that protein ne...
Kernel-Based Data Fusion and Its Application to Protein Function Prediction in Yeast
Kernel methods provide a principled framework in which to represent many types of data, including vectors, strings, trees and graphs. As such, these methods are useful for drawing inferences about biological phenomena. We describe a method for combining multiple kernel representations in an optimal fashion, by formulating the problem as a convex optimization problem that can be solved using sem...
Prediction of protein subcellular localization.
Because the protein's function is usually related to its subcellular localization, the ability to predict subcellular localization directly from protein sequences will be useful for inferring protein functions. Recent years have seen a surging interest in the development of novel computational tools to predict subcellular localization. At present, these approaches, based on a wide range of algo...
Improvement of PSORT II Protein Sorting Prediction for Mammalian Proteins
The PSORT system [8] is a unique tool for the prediction of protein subcellular localization in a sense that it can deal with proteins localized at almost all the subcellular compartments. In its several versions, PSORT II [5] was developed for the prediction of eukaryotic proteins using yeast sequences as its training data. The reason why the data from a single species were used was that train...