Bloom Filters via d - Left Hashing and Dynamic Bit Reassignment Extended
نویسندگان
چکیده
In recent work, the authors introduced a data structure with the same functionality as a counting Bloom filter (CBF) based on fingerprints and the d-left hashing technique. This paper describes dynamic bit reassignment, an approach that allows the size of the fingerprint to flexibly change with the load in each hash bucket, thereby reducing the probability of a false positive. This technique allows us to not only improve our d-left counting Bloom filter, but also to construct a data structure with the same functionality as a Bloom filter, including the ability to handle insertions online, that yields fewer false positives for sufficiently large filters. Our results show that our d-left Bloom filter data structure begins achieving smaller false positive rates than the standard construction at 16 bits per element. We explain the technique, describe why it is amenable to hardware implementation, and provide experimental results.
منابع مشابه
An Improved Construction for Counting Bloom Filters
A counting Bloom filter (CBF) generalizes a Bloom filter data structure so as to allow membership queries on a set that can be changing dynamically via insertions and deletions. As with a Bloom filter, a CBF obtains space savings by allowing false positives. We provide a simple hashing-based alternative based on d-left hashing called a d-left CBF (dlCBF). The dlCBF offers the same functionality...
متن کاملUsing the Power of Two Choices to Improve Bloom Filters
We consider the combination of two ideas from the hashing literature: the power of two choices and Bloom filters. Specifically, we show via simulations that, in comparison with a standard Bloom filter, using the power of two choices can yield modest reductions in the false positive probability using the same amount of space and more hashing. While the improvements are sufficiently small that th...
متن کاملEfficient Cryptanalysis of Bloom Filters for Privacy-Preserving Record Linkage
Privacy-preserving record linkage (PPRL) is the process of identifying records that represent the same entity across databases held by different organizations without revealing any sensitive information about these entities. A popular technique used in PPRL is Bloom filter encoding, which has shown to be an efficient and effective way to encode sensitive information into bit vectors while still...
متن کاملImproving counting Bloom filter performance with fingerprints
a r t i c l e i n f o a b s t r a c t Bloom filters (BFs) are used in many applications for approximate check of set membership. Counting Bloom filters (CBFs) are an extension of BFs that enable the deletion of entries at the cost of additional storage requirements. Several alternatives to CBFs can be used to reduce the storage overhead. For example schemes based on d-left hashing or Cuckoo has...
متن کاملmd5bloom: Forensic filesystem hashing revisited
Hashing is a fundamental tool in digital forensic analysis used both to ensure data integrity and to efficiently identify known data objects. However, despite many years of practice, its basic use has advanced little. Our objective is to leverage advanced hashing techniques in order to improve the efficiency and scalability of digital forensic analysis. Specifically, we explore the use of Bloom...
متن کامل