Handling missing values in support vector machine classifiers

Authors

  • Kristiaan Pelckmans
  • Jos De Brabanter
  • Johan A. K. Suykens
  • Bart De Moor
Abstract

This paper discusses the task of learning a classifier from observed data containing missing values amongst the inputs which are missing completely at random. A non-parametric perspective is adopted by defining a modified risk taking into account the uncertainty of the predicted outputs when missing values are involved. It is shown that this approach generalizes the approach of mean imputation in the linear case and the resulting kernel machine reduces to the standard Support Vector Machine (SVM) when no input values are missing. Furthermore, the method is extended to the multivariate case of fitting additive models using componentwise kernel machines, and an efficient implementation is based on the Least Squares Support Vector Machine (LS-SVM) classifier formulation.
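
The abstract names mean imputation in the linear case as the baseline that the proposed method generalizes. A minimal sketch of that baseline only (not the paper's modified-risk formulation) may help fix ideas. It assumes missing entries are encoded as `None`, and the helper names `column_means`, `mean_impute`, and `linear_score`, as well as the toy data and weights, are hypothetical choices for illustration.

```python
from typing import List, Optional, Sequence

Row = List[Optional[float]]  # None marks a missing input value (assumed encoding)


def column_means(rows: Sequence[Row]) -> List[float]:
    """Mean of each input column, computed over observed entries only."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # Fall back to 0.0 if a column has no observed values (arbitrary choice).
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return means


def mean_impute(rows: Sequence[Row], means: Sequence[float]) -> List[List[float]]:
    """Replace every missing entry with the mean of its column."""
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def linear_score(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Decision value w.x + b of a linear classifier; its sign gives the class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


# Toy data: two inputs, one missing value in each of the last two rows.
X = [[1.0, 2.0], [3.0, None], [None, 6.0]]
means = column_means(X)
X_filled = mean_impute(X, means)

# Hypothetical weights; in practice they would come from training a linear SVM.
score = linear_score([0.5, -1.0], 0.0, X_filled[1])
```

For a linear model, scoring the mean-imputed input equals the average of the scores over the empirical distribution of the missing component, which is one way to read why mean imputation arises as the linear special case.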

Related articles

Support Vector Machine Based Facies Classification Using Seismic Attributes in an Oil Field of Iran

Seismic facies analysis (SFA) aims to classify similar seismic traces based on amplitude, phase, frequency, and other seismic attributes. SFA has proven useful in interpreting seismic data, allowing significant information on subsurface geological structures to be extracted. While facies analysis has been widely investigated through unsupervised-classification-based studies, there are few cases...

Improving Naive Bayesian Classifier by Discriminative Training

Discriminative classifiers such as Support Vector Machines (SVM) directly learn a discriminant function or a posterior probability model to perform classification. On the other hand, generative classifiers often learn a joint probability model and then use the Bayes rule to construct a posterior classifier. In general, generative classifiers are not as accurate as discriminative classifiers. Ho...

Application of ensemble learning techniques to model the atmospheric concentration of SO2

In view of pollution prediction modeling, the study adopts homogenous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of Sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...

Communal Neural Network for Ovarian Cancer Mutation Classification

Microarrays are being used to express thousands of genes at a time which is helpful to diagnose and cure many diseases with higher accuracy using diagnostic classifiers. However, 90% of the time gene expression datasets contain multiple missing values because of slide scratches, hybridization error, image corruption and etc. These missing values affect classifiers accuracy as most of the classi...

SUBCLASS FUZZY-SVM CLASSIFIER AS AN EFFICIENT METHOD TO ENHANCE THE MASS DETECTION IN MAMMOGRAMS

This paper is concerned with the development of a novel classifier for automatic mass detection of mammograms, based on contourlet feature extraction in conjunction with statistical and fuzzy classifiers. In this method, mammograms are segmented into regions of interest (ROI) in order to extract features including geometrical and contourlet coefficients. The extracted features benefit from...

Journal:
  • Neural Networks: the official journal of the International Neural Network Society

Volume 18, Issue 5-6

Pages: -

Publication date: 2005