k*-Nearest Neighbors: From Global to Local

Authors

  • Oren Anava
  • Kfir Y. Levy
Abstract

The weighted k-nearest neighbors algorithm is one of the most fundamental nonparametric methods in pattern recognition and machine learning. The question of setting the optimal number of neighbors, as well as the optimal weights, has received much attention throughout the years; nevertheless, this problem seems to have remained unsettled. In this paper we offer a simple approach to locally weighted regression/classification, in which we make the bias-variance tradeoff explicit. Our formulation enables us to phrase a notion of optimal weights and to find these weights, as well as the optimal number of neighbors, efficiently and adaptively for each data point whose value we wish to estimate. The applicability of our approach is demonstrated on several datasets, showing superior performance over standard locally weighted methods.
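
For context, the standard baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration of distance-weighted k-NN regression with a fixed, global k; it is not the k*-NN method of the paper. The function name `weighted_knn_predict`, the `eps` parameter, and the choice of inverse-distance weights are assumptions made for illustration only.

```python
import math


def weighted_knn_predict(train_x, train_y, query, k, eps=1e-12):
    """Distance-weighted k-NN regression with a fixed, global k (illustrative baseline)."""
    # Euclidean distance from the query to every training point.
    dists = [math.dist(x, query) for x in train_x]
    # Indices of the k closest training points.
    nearest = sorted(range(len(train_x)), key=lambda i: dists[i])[:k]
    # Inverse-distance weights (an assumed, common choice);
    # eps avoids division by zero when the query matches a training point.
    weights = [1.0 / (dists[i] + eps) for i in nearest]
    total = sum(weights)
    # Weighted average of the neighbors' target values.
    return sum(w * train_y[i] for w, i in zip(weights, nearest)) / total


# Hypothetical toy data: one-dimensional inputs with targets equal to the input.
train_x = [(0.0,), (1.0,), (10.0,)]
train_y = [0.0, 1.0, 10.0]
prediction = weighted_knn_predict(train_x, train_y, (0.5,), k=2)
```

In this baseline, k and the weighting scheme are fixed once for all queries. The approach described in the abstract instead chooses both the weights and the number of neighbors adaptively for each query point.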

Similar resources

K Nearest Neighbor Classification with Local Induction of the Simple Value Difference Metric

The classical k nearest neighbor (k-nn) classification assumes that a fixed global metric is defined and searching for nearest neighbors is always based on this global metric. In the paper we present a model with local induction of a metric. Any test object induces a local metric from the neighborhood of this object and selects k nearest neighbors according to this locally induced metric. To in...

A Novel Hybrid Approach for Email Spam Detection based on Scatter Search Algorithm and K-Nearest Neighbors

Because cyberspace and the Internet predominate in users' lives, they bring, in addition to business opportunities and time savings, threats such as information theft and system penetration in both hardware and software. Security is the top priority in preventing cyber-attacks, and users should first detect the type of attack because virtual environments are not moni...

A comparative study of performance of K-nearest neighbors and support vector machines for classification of groundwater

The aim of this work is to examine the feasibilities of the support vector machines (SVMs) and K-nearest neighbor (K-NN) classifier methods for the classification of an aquifer in the Khuzestan Province, Iran. For this purpose, 17 groundwater quality variables including EC, TDS, turbidity, pH, total hardness, Ca, Mg, total alkalinity, sulfate, nitrate, nitrite, fluoride, phosphate, Fe, Mn, Cu, ...

Combining Local and Global KNN With Cotraining

Semi-supervised learning is a machine learning paradigm in which the induced hypothesis is improved by taking advantage of unlabeled data. It is particularly useful when labeled data is scarce. Cotraining is a widely adopted semi-supervised approach that assumes the availability of two views of the training data, a restrictive assumption for most real-world tasks. In this paper, we propose a one-vie...

Local and global scaling reduce hubs in space

'Hubness' has recently been identified as a general problem of high-dimensional data spaces, manifesting itself in the emergence of objects, so-called hubs, which tend to be among the k nearest neighbors of a large number of data items. As a consequence, many nearest neighbor relations in the distance space are asymmetric, that is, object y is amongst the nearest neighbors of x but not vice versa...

Publication date: 2016