Query-Based Improvement Procedure and Self-Adaptive Graph Construction Algorithm for Approximate Nearest Neighbor Search

Author

  • Alexander Ponomarenko
Abstract

The nearest neighbor search problem has been well known since the 1960s, and many approaches have been proposed. One is to build a graph over the set of objects in a given database and use a greedy walk as the basis for a search algorithm. If the greedy walk can find the nearest neighbor in the graph from any starting vertex in a small number of steps, such a graph is called a navigable small world. In this paper we propose a new algorithm for building graphs with navigable small world properties. The main advantage of the proposed algorithm is that it has no input parameters and can adapt on the fly to any changes in the data distribution. The algorithm is based on the idea of removing local minima by adding new edges. We apply this idea to improve the search properties of the structure by using the set of queries at the execution stage. The paper reports an empirical study of the proposed algorithm and a comparison with previous work.
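
The idea of a greedy walk that gets stuck in a local minimum, and of repairing that minimum by adding an edge, can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the function names, the adjacency-set graph representation, and especially the brute-force scan used to detect a false local minimum are assumptions made only for illustration, since the abstract does not say how the true nearest neighbor is identified.

```python
import math


def greedy_walk(graph, points, query, start):
    """Walk from `start`, always moving to the neighbor closest to `query`.

    Stops at a vertex none of whose neighbors is strictly closer to the
    query, i.e. at a local minimum of the distance function on the graph.
    """
    current = start
    while True:
        # Closest neighbor of the current vertex (or the vertex itself
        # if it has no neighbors at all).
        best = min(
            graph[current],
            key=lambda v: math.dist(points[v], query),
            default=current,
        )
        if math.dist(points[best], query) < math.dist(points[current], query):
            current = best
        else:
            return current


def improve_with_query(graph, points, query, start):
    """Remove one false local minimum for `query` by adding an edge.

    ASSUMPTION: the true nearest neighbor is found by a brute-force scan.
    This stands in for whatever mechanism the paper actually uses.
    Returns True if an edge was added.
    """
    found = greedy_walk(graph, points, query, start)
    true_nn = min(range(len(points)), key=lambda v: math.dist(points[v], query))
    if found == true_nn:
        return False
    # Connect the false local minimum to the true nearest neighbor, so the
    # same walk no longer stops at `found` for this query.
    graph[found].add(true_nn)
    graph[true_nn].add(found)
    return True
```

In this sketch, each query for which the walk fails contributes one undirected edge, so the graph changes only where observed queries expose a local minimum. No parameter such as a fixed neighbor count is needed, which is in the spirit of the parameter-free property the abstract claims.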

Similar resources

EFANNA : An Extremely Fast Approximate Nearest Neighbor Search Algorithm Based on kNN Graph

Approximate nearest neighbor (ANN) search is a fundamental problem in many areas of data mining, machine learning and computer vision. The performance of traditional hierarchical structure (tree) based methods decreases as the dimensionality of data grows, while hashing based methods usually lack efficiency in practice. Recently, the graph based methods have drawn considerable attention. The ma...

Pruned Bi-directed K-nearest Neighbor Graph for Proximity Search

In this paper, we address the problems with fast proximity searches for high-dimensional data by using a graph as an index. Graph-based methods that use the k-nearest neighbor graph (KNNG) as an index perform better than tree-based and hash-based methods in terms of search precision and query time. To further improve the performance of the KNNG, the number of edges should be increased. However, ...

Fast Large-Scale Approximate Graph Construction for NLP

Many natural language processing problems involve constructing large nearest-neighbor graphs. We propose a system called FLAG to construct such graphs approximately from large data sets. To handle the large amount of data, our algorithm maintains approximate counts based on sketching algorithms. To find the approximate nearest neighbors, our algorithm pairs a new distributed online-PMI algorith...

Predictive Indexing for Fast Search

We tackle the computational problem of query-conditioned search. Given a machine-learned scoring rule and a query distribution, we build a predictive index by precomputing lists of potential results sorted based on an expected score of the result over future queries. The predictive index datastructure supports an anytime algorithm for approximate retrieval of the top elements. The general appro...

Graph-based time-space trade-offs for approximate near neighbors

We take a first step towards a rigorous asymptotic analysis of graph-based approaches for finding (approximate) nearest neighbors in high-dimensional spaces, by analyzing the complexity of (randomized) greedy walks on the approximate near neighbor graph. For random data sets of size n = 2^{o(d)} on the d-dimensional Euclidean unit sphere, using near neighbor graphs we can provably solve the approx...

Publication date: 2015