Approximate Nearest Neighbors Search Without False Negatives for $l_2$ for $c > \sqrt{\log\log n}$
Abstract
In this paper, we report progress on the open problem posed by Pagh [11], who considered nearest neighbor search without false negatives for the Hamming distance. We show new data structures for solving the c-approximate nearest neighbors problem without false negatives in high-dimensional Euclidean space $\mathbb{R}^d$. These data structures work for any $c = \omega(\sqrt{\log\log n})$, where n is the number of points in the input set, with poly-logarithmic query time and polynomial preprocessing time. This improves over the known algorithms, which require c to be $\Omega(\sqrt{d})$. The improvement is obtained by applying a sequence of reductions, which are interesting in their own right. First, we reduce the problem to d instances of dimension logarithmic in n. Next, these instances are reduced to a number of c-approximate nearest neighbor search instances in the space $(\mathbb{R}^k)^L$ equipped with the metric $m(x, y) = \max_{1 \le i \le L} \|x_i - y_i\|_2$.
1998 ACM Subject Classification: F.2.2; G.3
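As a reading aid, the block-wise maximum metric named in the abstract, together with the "no false negatives" guarantee, can be sketched in Python. This is only an illustrative brute-force baseline, not the data structure from the paper: the function names `block_max_metric` and `near_neighbor_scan` are hypothetical, and points are assumed to be given as sequences of L equal-length coordinate blocks.

```python
import math

def block_max_metric(x, y):
    """m(x, y) = max over blocks i of the Euclidean distance ||x_i - y_i||_2.

    x and y are sequences of L blocks; each block is a sequence of floats.
    (This representation is an assumption made for illustration.)
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of blocks")
    return max(math.dist(xi, yi) for xi, yi in zip(x, y))

def near_neighbor_scan(points, query, r, c):
    """Linear-scan baseline for c-approximate near neighbor search.

    Returns some point within distance c*r of the query, or None if no such
    point exists. Because every point is checked, a point within distance r
    can never be missed, so there are no false negatives. The price is
    linear query time, which the paper's data structures aim to avoid.
    """
    for p in points:
        if block_max_metric(p, query) <= c * r:
            return p
    return None

# Two blocks of dimension 2: per-block distances are 5.0 and 1.0.
x = [(0.0, 0.0), (1.0, 1.0)]
y = [(3.0, 4.0), (1.0, 2.0)]
print(block_max_metric(x, y))
```

The scan trades speed for certainty; randomized schemes typically get fast queries by accepting a small probability of missing a close point, which is exactly what the "without false negatives" setting rules out.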
Related resources
Improved approximate near neighbor search without false negatives for $l_2$
We present a new algorithm for c-approximate nearest neighbor search without false negatives for $l_2$. We enhance the dimension reduction method presented in [14] and combine it with the standard results of Indyk and Motwani [10]. We present an efficient algorithm with Las Vegas guarantees for any c > 1. This improves over the previous results, which require $c = \omega(\log\log n)$ [14], where n is...
Locality-Sensitive Hashing Without False Negatives for l_p
In this paper, we show a construction of locality-sensitive hash functions without false negatives, i.e., functions that ensure a collision for every pair of points within a given radius R in d-dimensional space equipped with the $l_p$ norm, where p ∈ [1,∞]. Furthermore, we show how to use these hash functions to solve the c-approximate nearest neighbor search problem without false negatives. Namely, if there is a...
On fast bounded locality sensitive hashing
In this paper, we examine hash functions expressed as scalar products, i.e., $f(x) = \langle v, x \rangle$, for some bounded random vector v. Such hash functions have numerous applications, but often there is a need to optimize the choice of the distribution of v. In the present work, we focus on so-called anti-concentration bounds, i.e., upper bounds on $P[|\langle v, x \rangle| < \alpha]$. In many applications, v is...
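To make the anti-concentration quantity concrete, the probability $P[|\langle v, x \rangle| < \alpha]$ can be estimated empirically. The sketch below is a Monte Carlo illustration under a stated assumption: v is drawn uniformly from {-1, +1}^d, which is one example of a bounded random vector and not necessarily the distribution studied in the paper. The function names are hypothetical.

```python
import random

def inner(v, x):
    """Scalar product <v, x> of two equal-length sequences."""
    return sum(vi * xi for vi, xi in zip(v, x))

def estimate_small_projection_prob(x, alpha, trials=10000, seed=0):
    """Monte Carlo estimate of P[|<v, x>| < alpha].

    Assumption for illustration: each coordinate of v is an independent
    uniform random sign (-1 or +1). A seeded generator keeps runs repeatable.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        v = [rng.choice((-1.0, 1.0)) for _ in x]
        if abs(inner(v, x)) < alpha:
            hits += 1
    return hits / trials

# For a vector with a single nonzero coordinate, |<v, x>| is always 1,
# so the event |<v, x>| < 0.5 never occurs.
print(estimate_small_projection_prob([1.0, 0.0, 0.0], alpha=0.5))
```

An anti-concentration bound is an analytic upper bound on this probability; a simulation like this one can only sanity-check such a bound for particular x and alpha.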
A Novel Hybrid Approach for Email Spam Detection based on Scatter Search Algorithm and K-Nearest Neighbors
Because cyberspace and the Internet dominate users' lives, they bring not only business opportunities and time savings but also threats to hardware and software, such as information theft and system intrusion. Security is the top priority in preventing cyber-attacks, and users should first identify the type of attack, because virtual environments are not moni...
EFANNA : An Extremely Fast Approximate Nearest Neighbor Search Algorithm Based on kNN Graph
Approximate nearest neighbor (ANN) search is a fundamental problem in many areas of data mining, machine learning, and computer vision. The performance of traditional hierarchical-structure (tree) based methods decreases as the dimensionality of the data grows, while hashing-based methods usually lack efficiency in practice. Recently, graph-based methods have drawn considerable attention. The ma...