Building a Biased Least Squares Support Vector Machine Classifier for Positive and Unlabeled Learning

Authors

  • Ting Ke
  • Lujia Song
  • Bing Yang
  • Xinbin Zhao
  • Ling Jing
Abstract

Learning from positive and unlabeled examples (PU learning) is a special case of semi-supervised binary classification. Its key feature is that no labeled negative training data are available, which makes traditional classification techniques inapplicable. Following the idea of Biased-SVM, one of the best-known classifiers for this setting, this paper proposes a biased least squares support vector machine classifier (Biased-LSSVM) for PU learning. More specifically, we treat unlabeled examples as negative examples with noise and build a least squares support vector machine classifier that uses two penalty parameters, C_p and C_n, to weight the misclassification errors of positive and negative examples, respectively. Because PU learning places more emphasis on classifying as many positive examples as possible correctly, the parameters satisfy C_p ≥ C_n. Compared with Biased-SVM, the proposed classifier has three advantages. First, Biased-LSSVM reflects the class labels of all examples more fully and accurately than Biased-SVM. Second, Biased-LSSVM is more stable than Biased-SVM: its performance changes less over a wide range of ratios of positive examples among the unlabeled examples. Finally, the time complexity of Biased-LSSVM is lower than that of Biased-SVM, because Biased-LSSVM only needs to solve a system of linear equations, whereas Biased-SVM requires solving a quadratic programming problem. Experiments on two real applications, text classification and bioinformatics classification, support these claims and show that Biased-LSSVM is more effective than Biased-SVM and other popular methods such as EB-SVM, ROC-SVM and S-EM.
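
The abstract does not spell out the optimization problem, so the following is only a minimal sketch of how such a classifier might look, assuming the standard weighted LS-SVM classifier formulation: each example i gets a squared-error penalty C_i (C_p for positive examples, C_n for unlabeled examples treated as negatives), and training reduces to the single linear system [[0, y^T], [y, Ω + diag(1/C_i)]] [b; α] = [0; 1] with Ω_ij = y_i y_j K(x_i, x_j). The function names, the RBF kernel, and the default parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))


def fit_biased_lssvm(X_pos, X_unl, C_p=10.0, C_n=1.0, gamma=1.0):
    """Fit a weighted LS-SVM with unlabeled examples treated as noisy negatives.

    Assumption: this is the textbook weighted LS-SVM dual system, used here
    as a stand-in for the paper's Biased-LSSVM.
    """
    if C_p < C_n:
        raise ValueError("the abstract requires C_p >= C_n")
    X = np.vstack([X_pos, X_unl])
    y = np.concatenate([np.ones(len(X_pos)), -np.ones(len(X_unl))])
    C = np.concatenate([np.full(len(X_pos), C_p), np.full(len(X_unl), C_n)])
    n = len(y)

    # Omega_ij = y_i * y_j * K(x_i, x_j)
    omega = np.outer(y, y) * rbf_kernel(X, X, gamma)

    # Assemble the (n + 1) x (n + 1) linear system in the unknowns (b, alpha).
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.diag(1.0 / C)
    rhs = np.concatenate([[0.0], np.ones(n)])

    # One linear solve replaces the quadratic program of a standard SVM.
    sol = np.linalg.solve(A, rhs)
    return {"X": X, "y": y, "alpha": sol[1:], "b": sol[0], "gamma": gamma}


def decision_function(model, X_new):
    """f(x) = sum_i alpha_i * y_i * K(x, x_i) + b; positive means class +1."""
    K = rbf_kernel(X_new, model["X"], model["gamma"])
    return K @ (model["alpha"] * model["y"]) + model["b"]
```

In this sketch, `fit_biased_lssvm(X_pos, X_unl)` takes two 2-D arrays of feature vectors, and the sign of `decision_function(model, X_new)` gives the predicted label. Setting C_p larger than C_n makes errors on the labeled positives cost more than errors on the unlabeled set, which is the bias the abstract describes.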

Related resources

Least Squares Support Vector Machine for Constitutive Modeling of Clay

Constitutive modeling of clay is an important research topic in geotechnical engineering. It is difficult to use precise mathematical expressions to approximate the stress-strain relationship of clay. Artificial neural networks (ANN) and support vector machines (SVM) have been successfully used in constitutive modeling of clay. However, generalization ability of ANN has some limitations, and application of...

Anomaly Detection Using SVM as Classifier and Decision Tree for Optimizing Feature Vectors

With the advancement and development of computer network technologies, the way for intruders has become smoother; therefore, to detect threats and attacks, the importance of intrusion detection systems (IDS) as one of the key elements of security is increasing. One of the challenges of intrusion detection systems is managing the large number of network traffic features. Removing un...

Linear Manifold Regularization for Large Scale Semi-supervised Learning

The enormous wealth of unlabeled data in many applications of machine learning is beginning to pose challenges to the designers of semi-supervised learning methods. We are interested in developing linear classification algorithms to efficiently learn from massive partially labeled datasets. In this paper, we propose Linear Laplacian Support Vector Machines and Linear Laplacian Regularized Least...

Fault diagnosis in a distillation column using a support vector machine based classifier

Fault diagnosis has always been an essential aspect of control system design. This is necessary because of the growing demand for increased performance and safety of industrial systems. The support vector machine classifier is a new technique based on statistical learning theory and is designed to reduce structural bias. Support vector machine classification in many applications in v...

Least-squares support vector machine and its application in the simultaneous quantitative spectrophotometric determination of pharmaceutical ternary mixture

This paper proposes the least-squares support vector machine (LS-SVM) as an intelligent method applied on absorption spectra for the simultaneous determination of paracetamol (PCT), caffeine (CAF) and ibuprofen (IB) in Novafen. The signal to noise ratio (S/N) increased. In the LS-SVM model, the kernel parameter (σ²) and capacity factor (C) were also optimized. Excellent prediction was shown usin...

Journal:
  • JSW

Volume 9

Publication date: 2014