Fast Nearest Neighbors

Author

  • THOMAS KOLLAR
Abstract

We present a review of the literature on fast nearest neighbors, covering the basic approach from Karger and Ruhl [4] and a recent technique called cover trees. A small error in the Insert procedure from the original paper on cover trees is corrected, and a Python implementation of the basic cover tree algorithms is used to examine how query time actually varies with the size of the problem.
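
For orientation, the problem that these methods accelerate can be sketched as a plain linear scan. This is only a brute-force baseline under assumed Euclidean distance, not the cover tree implementation described in the abstract; the function name `nearest_neighbor` and the tuple-based point representation are hypothetical choices made for illustration.

```python
import math


def nearest_neighbor(points, query):
    """Return (index, distance) of the point closest to `query`.

    Brute-force baseline: compares the query against every point, so one
    query costs time linear in len(points). Assumes Euclidean distance.
    """
    best_index, best_dist = -1, math.inf
    for i, p in enumerate(points):
        d = math.dist(p, query)  # Euclidean distance, Python 3.8+
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index, best_dist


if __name__ == "__main__":
    data = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
    print(nearest_neighbor(data, (0.9, 1.2)))
```

A scan like this is a natural reference point when measuring how query time grows with the number of points, since tree-based structures aim to avoid visiting every point on each query.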

Similar resources

Evaluation of Fast K-nearest Neighbors Search Methods Using Real Data Sets

The problem of k-nearest neighbors (kNN) search is to find the k nearest neighbors of a query point in a given data set. To speed up this search, many fast kNN search algorithms have been proposed. The performance of fast kNN search algorithms is highly influenced by the number of dimensions, the number of data points, and the data distribution of a data set. In the extreme cas...

Fast Large-Scale Approximate Graph Construction for NLP

Many natural language processing problems involve constructing large nearest-neighbor graphs. We propose a system called FLAG to construct such graphs approximately from large data sets. To handle the large amount of data, our algorithm maintains approximate counts based on sketching algorithms. To find the approximate nearest neighbors, our algorithm pairs a new distributed online-PMI algorith...

Fast k-means based on KNN Graph

In the era of big data, k-means clustering has been widely adopted as a basic processing tool in various contexts. However, its computational cost can be prohibitively high when the data size and the number of clusters are large. It is well known that the processing bottleneck of k-means lies in seeking the closest centroid in each iteration. In this paper, a novel solution towards the...

A Novel Hybrid Approach for Email Spam Detection based on Scatter Search Algorithm and K-Nearest Neighbors

Because cyberspace and the Internet predominate in users' lives, they bring, alongside business opportunities and time savings, threats such as information theft and system penetration in both hardware and software. Security is the top priority in preventing cyber-attacks, and users should first detect the type of attack because virtual environments are not moni...

Publication date: 2006