Handling missing values in support vector machine classifiers
نویسندگان
چکیده
This paper discusses the task of learning a classifier from observed data containing missing values amongst the inputs which are missing completely at random. A non-parametric perspective is adopted by defining a modified risk taking into account the uncertainty of the predicted outputs when missing values are involved. It is shown that this approach generalizes the approach of mean imputation in the linear case and the resulting kernel machine reduces to the standard Support Vector Machine (SVM) when no input values are missing. Furthermore, the method is extended to the multivariate case of fitting additive models using componentwise kernel machines, and an efficient implementation is based on the Least Squares Support Vector Machine (LS-SVM) classifier formulation.
منابع مشابه
Support Vector Machine Based Facies Classification Using Seismic Attributes in an Oil Field of Iran
Seismic facies analysis (SFA) aims to classify similar seismic traces based on amplitude, phase, frequency, and other seismic attributes. SFA has proven useful in interpreting seismic data, allowing significant information on subsurface geological structures to be extracted. While facies analysis has been widely investigated through unsupervised-classification-based studies, there are few cases...
متن کاملImproving Naive Bayesian Classifier by Discriminative Training
Discriminative classifiers such as Support Vector Machines (SVM) directly learn a discriminant function or a posterior probability model to perform classification. On the other hand, generative classifiers often learn a joint probability model and then use the Bayes rule to construct a posterior classifier. In general, generative classifiers are not as accurate as discriminative classifiers. Ho...
متن کاملApplication of ensemble learning techniques to model the atmospheric concentration of SO2
In view of pollution prediction modeling, the study adopts homogenous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of Sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...
متن کاملCommunal Neural Network for Ovarian Cancer Mutation Classification
Microarrays are being used to express thousands of genes at a time which is helpful to diagnose and cure many diseases with higher accuracy using diagnostic classifiers. However, 90% of the time gene expression datasets contain multiple missing values because of slide scratches, hybridization error, image corruption and etc. These missing values affect classifiers accuracy as most of the classi...
متن کاملSUBCLASS FUZZY-SVM CLASSIFIER AS AN EFFICIENT METHOD TO ENHANCE THE MASS DETECTION IN MAMMOGRAMS
This paper is concerned with the development of a novel classifier for automatic mass detection of mammograms, based on contourlet feature extraction in conjunction with statistical and fuzzy classifiers. In this method, mammograms are segmented into regions of interest (ROI) in order to extract features including geometrical and contourlet coefficients. The extracted features benefit from...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Neural networks : the official journal of the International Neural Network Society
دوره 18 5-6 شماره
صفحات -
تاریخ انتشار 2005