The Normalized Distance Preserving Binary Codes and Distance Table

Authors

  • Hongwei Zhao
  • Zhen Wang
  • Pingping Liu
  • Bin Wu
Abstract

In Euclidean space, approximate nearest neighbor (ANN) search measures similarity by computing Euclidean distances, which incurs high time complexity and large memory overhead. To address these problems, this paper maps the data from Euclidean space into Hamming space, and the mapping is required to satisfy a normalized distance similarity restriction and a quantization error constraint. First, the encoding centers and their binary labels are obtained through a lookup-based mechanism. Then, candidate hashing functions are learned under the supervision of the binary labels, and those that satisfy the entropy criterion are selected to increase the distinctiveness of the learned binary codes. During training, multiple groups of hashing functions are generated from different kinds of centers, which weakens the adverse influence of the initial centers. The data points with the minimal average Hamming distances are returned as the nearest neighbors. In Hamming space, different Euclidean distances may be replaced by one identical value, so a distance table is predefined to distinguish the degrees of similarity among data pairs that share the same Hamming distance. The experimental results show that the algorithm outperforms many state-of-the-art methods.
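
Two steps named in the abstract, selecting hashing functions by an entropy criterion and returning the points with the minimal average Hamming distance over several groups of functions, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the random hyperplanes in `make_group` merely stand in for the hashing functions that the paper learns under supervision of the binary labels, "highest bit entropy" is one plausible reading of the entropy criterion, and all function names are hypothetical. The lookup-based center encoding and the distance table are not modeled.

```python
import math
import random


def bit_entropy(bits):
    """Shannon entropy (in bits) of a sequence of 0/1 values."""
    p = sum(bits) / len(bits)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def hamming(a, b):
    """Hamming distance between two equal-length bit tuples."""
    return sum(x != y for x, y in zip(a, b))


def make_group(dim, n_candidates, rng):
    """Hypothetical candidates: random hyperplanes (weights, threshold).

    Assumption: these stand in for hashing functions learned from data.
    """
    return [
        ([rng.gauss(0, 1) for _ in range(dim)], rng.gauss(0, 1))
        for _ in range(n_candidates)
    ]


def apply_fn(fn, x):
    """Map a vector to one bit: 1 if it lies on the positive side."""
    weights, threshold = fn
    return 1 if sum(w * v for w, v in zip(weights, x)) > threshold else 0


def select_by_entropy(fns, data, n_bits):
    """Keep the n_bits candidates whose output bit has the highest entropy."""
    ranked = sorted(
        fns,
        key=lambda f: bit_entropy([apply_fn(f, x) for x in data]),
        reverse=True,
    )
    return ranked[:n_bits]


def encode(fns, x):
    """Binary code of x under one group of hashing functions."""
    return tuple(apply_fn(f, x) for f in fns)


def average_hamming_rank(query, data, groups, k):
    """Indices of the k points with the smallest Hamming distance
    to the query, averaged over all groups of hashing functions."""
    query_codes = [encode(g, query) for g in groups]
    scores = []
    for idx, x in enumerate(data):
        total = sum(hamming(qc, encode(g, x)) for qc, g in zip(query_codes, groups))
        scores.append((total / len(groups), idx))
    scores.sort()  # ties are broken by index, not by a distance table
    return [idx for _, idx in scores[:k]]


if __name__ == "__main__":
    rng = random.Random(0)
    data = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(50)]
    groups = [select_by_entropy(make_group(4, 16, rng), data, 8) for _ in range(3)]
    print(average_hamming_rank(data[0], data, groups, 5))
```

Ties in the averaged Hamming distance are broken here by index only; this is the situation in which the abstract's distance table would be used to separate pairs that share the same Hamming distance.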

Similar resources

Construction of Linear Codes Having Prescribed Primal-dual Minimum Distance with Applications in Cryptography

A method is given for the construction of linear codes with prescribed minimum distance and also prescribed minimum distance of the dual code. This works for codes over arbitrary finite fields. In the case of binary codes, Matsumoto et al. showed how such codes can be used to construct cryptographic Boolean functions. This new method allows one to compute new bounds on the size of such codes, extend...

Binary Gray Codes with Long Bit Runs

We show that there exists an n-bit cyclic binary Gray code all of whose bit runs have length at least n − 3 log_2 n. That is, there exists a cyclic ordering of {0, 1}^n such that adjacent words differ in exactly one (coordinate) bit, and such that no bit changes its value twice in any subsequence of n − 3 log_2 n consecutive words. Such Gray codes are ‘locally distance preserving’ in that Hamming ...

Cosine Similarity Search with Multi Index Hashing

Due to the rapid development of the Internet, recent years have witnessed an explosion in the rate of data generation. Dealing with data at current scales brings up unprecedented challenges. From the algorithmic viewpoint, executing existing linear algorithms in information retrieval and machine learning on such tremendous amounts of data incurs intolerable computational and storage costs. To addre...

Comparing apples to apples in the evaluation of binary coding methods

We discuss methodological issues related to the evaluation of unsupervised binary code construction methods for nearest neighbor search. These issues have been widely ignored in the literature. These coding methods attempt to preserve either Euclidean distance or angular (cosine) distance in the binary embedding space. We explain why when comparing a method whose goal is preserving cosine similarit...

Similarity-Preserving Binary Signature for Linear Subspaces

Linear subspace is an important representation for many kinds of real-world data in computer vision and pattern recognition, e.g. faces, motion videos, speeches. In this paper, first we define pairwise angular similarity and angular distance for linear subspaces. The angular distance satisfies non-negativity, identity of indiscernibles, symmetry and triangle inequality, and thus it is a metric....

Journal:
  • J. Inf. Sci. Eng.

Volume 33

Publication date: 2017