Exploring Bit-Difference for Approximate KNN Search in High-dimensional Databases

Authors

  • Bin Cui
  • Heng Tao Shen
  • Jialie Shen
  • Kian-Lee Tan
Abstract

In this paper, we develop a novel index structure to support efficient approximate k-nearest neighbor (KNN) queries in high-dimensional databases. In high-dimensional spaces, the cost of computing the distance (e.g., Euclidean distance) between two points accounts for a dominant portion of the overall query response time for in-memory processing. To reduce the number of distance computations, we first propose a structure (BID) that uses BIt-Difference to answer approximate KNN queries. BID employs one bit to represent each feature (dimension) of a point's feature vector, and the number of differing bits is used to prune points that are farther away. To handle real datasets, which are typically skewed, we enhance the BID mechanism with clustering, a cluster-adapted bitcoder and dimensional weights; the enhanced structure is named BID+. Extensive experiments show that our proposed method yields significant performance advantages over existing index structures on both real-life and synthetic high-dimensional datasets.
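The bit-difference idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' BID structure: the per-dimension mean threshold, the `max_diff` pruning cutoff, and the function names `encode`, `bit_difference` and `approx_knn` are all assumptions made for illustration, and the clustering, cluster-adapted bitcoder and dimensional weights of the enhanced variant are not modeled.

```python
import math

def encode(point, thresholds):
    """Pack one bit per dimension into an int.

    Bit i is 1 when point[i] >= thresholds[i]. The threshold rule is an
    assumption; the abstract does not say how each bit is chosen.
    """
    code = 0
    for i, (x, t) in enumerate(zip(point, thresholds)):
        if x >= t:
            code |= 1 << i
    return code

def bit_difference(a, b):
    """Number of bit positions in which the two codes differ."""
    return bin(a ^ b).count("1")

def approx_knn(query, points, k, max_diff):
    """Approximate KNN: prune by bit-difference, then rank survivors exactly.

    max_diff is a hypothetical cutoff: points whose code differs from the
    query code in more than max_diff bits are skipped without computing
    their Euclidean distance.
    """
    dims = len(query)
    # Assumed bitcoder: threshold each dimension at its mean over the dataset.
    thresholds = [sum(p[i] for p in points) / len(points) for i in range(dims)]
    codes = [encode(p, thresholds) for p in points]
    q_code = encode(query, thresholds)
    # Cheap filter on bit codes.
    candidates = [i for i, c in enumerate(codes)
                  if bit_difference(q_code, c) <= max_diff]
    # Exact Euclidean distance only for the surviving candidates.
    candidates.sort(key=lambda i: math.dist(query, points[i]))
    return candidates[:k]

points = [
    (0.1, 0.1, 0.1, 0.1),
    (0.9, 0.9, 0.9, 0.9),
    (0.2, 0.1, 0.2, 0.1),
    (0.8, 0.9, 0.1, 0.9),
]
# Indices of the nearest surviving points, closest first.
nearest = approx_knn((0.1, 0.1, 0.1, 0.2), points, k=2, max_diff=1)
```

In a real index the codes would be computed once when the data is loaded rather than per query; they are rebuilt inside `approx_knn` here only to keep the sketch short. A tighter `max_diff` skips more distance computations but can discard true neighbors, which is the accuracy trade-off an approximate search accepts.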


Similar resources

Optimisation of correlation matrix memory prognostic and diagnostic systems

Condition monitoring systems for prognostics and diagnostics can enable large and complex systems to be operated more safely, at a lower cost and have a longer lifetime than is possible without them. AURA Alert is a condition monitoring system that uses a fast approximate k Nearest Neighbour (kNN) search of a timeseries database containing known system states to identify anomalous system behavi...


Search Based Weighted Multi-Bit Flipping Algorithm for High-Performance Low-Complexity Decoding of LDPC Codes

In this paper, two new hybrid algorithms are proposed for decoding Low Density Parity Check (LDPC) codes. The original version of the proposed algorithms is named Search Based Weighted Multi Bit Flipping (SWMBF). The main idea of these algorithms is to flip a variable number of bits in each iteration, choosing the flips that lead to the syndrome vector with the least Hamming weight. To achieve this, the proposed algo...


Adaptive Quantization of the High-Dimensional Data for Efficient KNN Processing

In this paper, we present a novel index structure, called the SA-tree, to speed up processing of high-dimensional K-nearest neighbor (KNN) queries. The SA-tree employs data clustering and compression, i.e. utilizes the characteristics of each cluster to adaptively compress feature vectors into bit-strings. Hence our proposed mechanism can reduce the disk I/O and computational cost significantly...


Redundant Bit Vectors for Quickly Searching High-Dimensional Regions

Applications such as audio fingerprinting require search in high dimensions: find an item in a database that is similar to a query. An important property of this search task is that negative answers are very frequent: much of the time, a query does not correspond to any database item. We propose Redundant Bit Vectors (RBVs): a novel method for quickly solving this search problem. RBVs rely on t...




Publication date: 2005