Data Sparseness in Linear SVM

Authors

  • Xiang Li
  • Huaimin Wang
  • Bin Gu
  • Charles X. Ling
Abstract

Large sparse datasets are common in many real-world applications. Linear SVM has been shown to be very efficient for classifying such datasets. However, it is still unknown how data sparseness affects its convergence behavior. To study this problem systematically, we propose a novel approach to generate large, sparse data from real-world datasets, using statistical inference and the data sampling process in the PAC framework. We first study the convergence behavior of linear SVM experimentally and make several observations that are useful for real-world applications. We then offer theoretical proofs for our observations by studying the Bayes risk and the PAC bound. Our experimental and theoretical results are valuable for learning from large sparse datasets with linear SVM.
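The kind of experiment the abstract describes can be illustrated with a small, self-contained sketch. It is not the authors' data-generation procedure, which infers distributions from real-world datasets and is not detailed on this page. Here the data are purely synthetic: each feature is assumed to be Gaussian around a class-dependent mean and is kept with probability `density`, otherwise set to zero. The solver is a plain Pegasos-style stochastic subgradient method, swapped in as a stand-in for a linear SVM package. All function names and parameter values are hypothetical.

```python
import random

def make_sparse_data(n, d, density, rng, mu=0.5):
    """Draw n labelled instances; each feature survives with probability `density`.

    Assumed toy model (not from the paper): feature value = y * mu + N(0, 1).
    """
    data = []
    for _ in range(n):
        y = rng.choice((-1, 1))
        # Sparse row: only the surviving (nonzero) features are stored.
        x = {j: y * mu + rng.gauss(0.0, 1.0)
             for j in range(d) if rng.random() < density}
        data.append((x, y))
    return data

def dot(w, x):
    """Inner product of a dense weight list with a sparse row."""
    return sum(w[j] * v for j, v in x.items())

def train_pegasos(data, d, rng, lam=0.01, epochs=10):
    """Linear SVM (hinge loss, L2 penalty) via Pegasos subgradient steps."""
    w = [0.0] * d
    t = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            t += 1
            eta = 1.0 / (lam * t)          # decaying step size
            margin = y * dot(w, x)         # margin before the update
            shrink = 1.0 - eta * lam
            w = [shrink * wj for wj in w]  # gradient step on the L2 penalty
            if margin < 1.0:               # hinge loss is active
                for j, v in x.items():
                    w[j] += eta * y * v
    return w

def accuracy(w, data):
    """Fraction of instances whose predicted sign matches the label."""
    correct = sum(1 for x, y in data if (1 if dot(w, x) >= 0 else -1) == y)
    return correct / len(data)

def run(density, seed=0, d=50, n_train=400, n_test=400):
    """Train and test at one sparseness level; returns test accuracy."""
    rng = random.Random(seed)
    train = make_sparse_data(n_train, d, density, rng)
    test = make_sparse_data(n_test, d, density, rng)
    w = train_pegasos(train, d, rng)
    return accuracy(w, test)

if __name__ == "__main__":
    for density in (0.5, 0.1, 0.02):
        print(f"density={density:.2f}  test accuracy={run(density):.3f}")
```

In this toy model, a lower `density` leaves fewer informative features per instance (and some instances with none at all), so the best achievable accuracy falls as the data become sparser. This mirrors the role the Bayes risk plays in the abstract, but the sketch makes no claim about the paper's actual results.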

Similar resources

Sparse Least Squares Support Vector Machine Classifiers

In least squares support vector machine (LS-SVM) classifiers the original SVM formulation of Vapnik is modified by considering equality constraints within a form of ridge regression instead of inequality constraints. As a result the solution follows from solving a set of linear equations instead of a quadratic programming problem. However, a drawback is that sparseness is lost in the LS-SVM case ...

Sparse least squares Support Vector Machine classifiers

In least squares support vector machine (LS-SVM) classifiers the original SVM formulation of Vapnik is modified by considering equality constraints within a form of ridge regression instead of inequality constraints. As a result the solution follows from solving a set of linear equations instead of a quadratic programming problem. However, a drawback is that sparseness is lost in the LS-SVM ...

A Weighted Generalized LS-SVM

Neural networks play an important role in system modelling. This is especially true if model building is mainly based on observed data. Among neural models, Support Vector Machine (SVM) solutions are attracting increasing attention, mostly because they automatically answer certain crucial questions raised by neural network construction. They derive an ‘optimal’ network structure and answer...

3D gravity data-space inversion with sparseness and bound constraints

One of the most important foundations of gravity data inversion is the recognition of sharp boundaries between an ore body and its host rocks during the interpretation step. Therefore, this work attempts to develop an inversion approach to determine a 3D density distribution that produces a given gravity anomaly. The subsurface model consists of 3D rectangular prisms of known sizes ...

A Robust LS-SVM Regression

In comparison to the original SVM, which involves a quadratic programming task, LS-SVM simplifies the required computation, but unfortunately the sparseness of the standard SVM is lost. Another problem is that LS-SVM is only optimal if the training samples are corrupted by Gaussian noise. In Least Squares SVM (LS-SVM), the nonlinear solution is obtained by first mapping the input vector to a high ...

Publication date: 2015