Combining Local and Global KNN With Cotraining

نویسندگان

  • Víctor Laguna
  • Alneu de Andrade Lopes
چکیده

Semi-supervised learning is a machine learning paradigm in which the induced hypothesis is improved by taking advantage of unlabeled data. It is particularly useful when labeled data is scarce. Cotraining is a widely adopted semi-supervised approach that assumes availability of two views of the training data a restrictive assumption for most real world tasks. In this paper, we propose a one-view Cotraining approach that combines two different k-Nearest Neighbors (KNN) strategies referred to as global and local k-NN. In global KNN, the nearest neighbors selected to classify a new instance are given by the training examples which include this instance as one of their own k-nearest neighbors. In local KNN, on the other hand, the neighborhood considered when classifying a new instance is computed with the traditional KNN approach. We carried out experiments showing that a combination of these strategies significantly improves the classification accuracy in Cotraining, particularly when one single view of training data is available. We also introduce an optimized algorithm to cope with time complexity of computing the global KNN, which enables tackling real classification problems.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An Optimal Approach to Local and Global Text Coherence Evaluation Combining Entity-based, Graph-based and Entropy-based Approaches

Text coherence evaluation becomes a vital and lovely task in Natural Language Processing subfields, such as text summarization, question answering, text generation and machine translation. Existing methods like entity-based and graph-based models are engaging with nouns and noun phrases change role in sequential sentences within short part of a text. They even have limitations in global coheren...

متن کامل

Link Prediction using Network Embedding based on Global Similarity

Background: The link prediction issue is one of the most widely used problems in complex network analysis. Link prediction requires knowing the background of previous link connections and combining them with available information. The link prediction local approaches with node structure objectives are fast in case of speed but are not accurate enough. On the other hand, the global link predicti...

متن کامل

Combining Multiple and Classifiers for Increasing Accuracy for

In this paper we combining statistical, structural Global transformation and moments features to form hybrid feature vector .We are combining Classifiers for achieving high accuracy for Devanagari Script. To abolish the hitch of misclassification and increase the classifier accurac combining SVM and KNN together. The dataset used for experiment are created by us.

متن کامل

Geometric k-nearest neighbor estimation of entropy and mutual information

Nonparametric estimation of mutual information is used in a wide range of scientific problems to quantify dependence between variables. The k-nearest neighbor (knn) methods are consistent, and therefore expected to work well for a large sample size. These methods use geometrically regular local volume elements. This practice allows maximum localization of the volume elements, but can also induc...

متن کامل

Comparison of Four Methods for Premature Ventricular Contractions and Normal Beats Clustering

The learning capacity and the classification ability for normal beats and premature ventricular contractions clustering by four classification methods were compared: neural networks (NN), K-th nearest neighbour rule (Knn), discriminant analysis (DA) and fuzzy logic (FL). Twenty-six morphology feature parameters, which include information of amplitude, area, specific interval durations and measu...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010