Partition Min-Hash for Partial Duplicate Image Discovery

Authors

  • David C. Lee
  • Qifa Ke
  • Michael Isard
Abstract

In this paper, we propose Partition min-Hash (PmH), a novel hashing scheme for discovering partial duplicate images in a large database. Unlike the standard min-Hash algorithm, which assumes a bag-of-words image representation, our approach exploits the fact that duplicate regions among images are often localized. Through theoretical analysis, simulation, and empirical study, we show that PmH outperforms standard min-Hash in terms of precision and recall, while being orders of magnitude faster. When combined with the state-of-the-art Geometric min-Hash algorithm, our approach speeds up hashing by 10 times without losing precision or recall. Given a fixed time budget, our method achieves much higher recall than the state of the art.
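
The contrast the abstract draws between whole-image min-Hash and a partitioned variant can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how partitions are formed, so the fixed grid, the linear hash family, and every name below (`make_hash_funcs`, `min_hash`, `partition_min_hash`, `share_signature`) are assumptions made for illustration only.

```python
import random

PRIME = 2_147_483_647  # 2^31 - 1; assumed larger than any visual-word ID


def make_hash_funcs(num_hashes, seed=0):
    """Return (a, b) pairs defining h(x) = (a*x + b) mod PRIME (assumed hash family)."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def min_hash(words, hash_funcs):
    """Standard min-Hash: per hash function, keep the word with the smallest hash."""
    return tuple(min(words, key=lambda w: (a * w + b) % PRIME)
                 for a, b in hash_funcs)


def partition_min_hash(features, hash_funcs, grid, width, height):
    """Hypothetical partitioned variant: one signature per non-empty grid cell.

    features: iterable of (x, y, word_id); grid: number of cells per side.
    """
    cells = {}
    for x, y, word in features:
        col = min(int(x * grid / width), grid - 1)
        row = min(int(y * grid / height), grid - 1)
        cells.setdefault((row, col), set()).add(word)
    return {cell: min_hash(words, hash_funcs) for cell, words in cells.items()}


def share_signature(sigs_a, sigs_b):
    """Treat two images as candidates if any partition signatures are equal."""
    return bool(set(sigs_a.values()) & set(sigs_b.values()))
```

In this sketch, two images whose visual words agree only inside one grid cell still produce a matching signature for that cell, whereas a single whole-image signature would be diluted by the differing words elsewhere. That is the intuition behind exploiting localized duplicate regions; the actual partitioning and sketch construction in the paper may differ.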

Similar resources

Near Duplicate Image Detection: min-Hash and tf-idf Weighting

This paper proposes two novel image similarity measures for fast indexing via locality sensitive hashing. The similarity measures are applied and evaluated in the context of near duplicate image detection. The proposed method uses a visual vocabulary of vector quantized local feature descriptors (SIFT) and for retrieval exploits enhanced min-Hash techniques. Standard min-Hash uses an approximat...

Hash Functions for Near Duplicate Image Retrieval

This paper proposes new hash functions for indexing local image descriptors. These functions are first applied and evaluated as a range neighbor algorithm. We show that it obtains similar results as several state of the art algorithms. In the context of near duplicate image retrieval, we integrated the proposed hash functions within a bag of words approach. Because most of the other methods use...

Identifying and Indexing Near-Duplicate Images Using Optimizing Technique in Web Search

Today's World Wide Web is growing drastically, and duplicates occur in many fields. In particular, duplicate images are uploaded to the internet in areas such as food products, document images, medical images, and textiles. It is therefore very important to identify those duplicate images. Near duplicates can be similar copies or differ slightly in their visual content. Duplicate images introduce many p...

Robust Image Hashing Using NMF and Ring Partition for Image Analysis

Image hashing is an efficient technique for indexing and can be effectively used in image retrieval. This paper uses image hashing with a ring partition and a non-negative matrix factorization (NMF), which provide rotation robustness and good discriminative capability, respectively. The aim of ring partition in image hashing is to construct a rotation invariant secondary image for dimensionality ...

Compressed Image Hashing using Minimum Magnitude CSLBP

Image hashing allows compression, enhancement or other signal processing operations on digital images which are usually acceptable manipulations. Whereas, cryptographic hash functions are very sensitive to even single bit changes in image. Image hashing is a sum of important quality features in quantized form. In this paper, we proposed a novel image hashing algorithm for authentication which i...

Publication date: 2010