A highly scalable perfect hashing algorithm
Similar resources
Fast and Scalable Minimal Perfect Hashing for Massive Key Sets
Minimal perfect hash functions provide space-efficient and collision-free hashing on static sets. Existing algorithms and implementations that build such functions have practical limitations on the number of input elements they can process, due to high construction time, RAM or external memory usage. We revisit a simple algorithm and show that it is highly competitive with the state of the art,...
Distributed perfect hashing for very large key sets
A perfect hash function (PHF) h : S → [0, m − 1] for a key set S ⊆ U of size n, where m ≥ n and U is a key universe, is an injective function that maps the keys of S to unique values. A minimal perfect hash function (MPHF) is a PHF with m = n, the smallest possible range. Minimal perfect hash functions are widely used for memory-efficient storage and fast retrieval of items from static sets. In t...
Perfect Hashing for Strings: Formalization and Algorithms
Numbers and strings are two objects manipulated by most programs. Hashing has been well-studied for numbers and it has been effective in practice. In contrast, basic hashing issues for strings remain largely unexplored. In this paper, we identify and formulate the core hashing problem for strings that we call substring hashing. Our main technical results are highly efficient sequential/parallel (...
Perfect hashing using sparse matrix packing
This article presents a simple algorithm for packing sparse 2-D arrays into minimal 1-D arrays in O(r²) time. Retrieving an element from the packed 1-D array is O(1). This packing algorithm is then applied to create minimal perfect hashing functions for large word lists. Many existing perfect hashing algorithms process large word lists by segmenting them into several smaller lists. The perfect ...
Using Tries to Eliminate Pattern Collisions in Perfect Hashing
Many current perfect hashing algorithms suffer from the problem of pattern collisions. In this paper, a perfect hashing technique that uses array-based tries and a simple sparse matrix packing algorithm is introduced. This technique eliminates all pattern collisions, and because of this it can be used to form ordered minimal perfect hash functions on extremely large word lists. This algorithm i...
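The definition quoted in the listing above (a PHF is an injective map h : S → [0, m − 1], and an MPHF is a PHF with m = n) can be checked mechanically on a small key set. The following is a minimal sketch only: the helper names `is_perfect`, `is_minimal_perfect`, and `build_lookup_mphf` are hypothetical and do not come from any of the listed papers, and the lookup-table construction is a deliberately naive stand-in, not one of the space-efficient algorithms those papers describe.

```python
from typing import Callable, Hashable, Iterable


def is_perfect(h: Callable[[Hashable], int], keys: Iterable[Hashable], m: int) -> bool:
    """Return True if h maps the key set injectively into [0, m - 1]."""
    key_set = set(keys)                      # S is a set, so drop duplicate keys
    values = [h(k) for k in key_set]
    in_range = all(0 <= v < m for v in values)
    injective = len(set(values)) == len(values)
    return in_range and injective


def is_minimal_perfect(h: Callable[[Hashable], int], keys: Iterable[Hashable]) -> bool:
    """Return True if h is a perfect hash function with range size m = n = |S|."""
    key_set = set(keys)
    return is_perfect(h, key_set, len(key_set))


def build_lookup_mphf(keys: Iterable[Hashable]) -> Callable[[Hashable], int]:
    """Naive illustration: store an explicit key -> rank table.

    This is minimal and perfect by construction, but it keeps every key in
    memory, which is exactly the cost that real MPHF constructions avoid.
    """
    table = {k: i for i, k in enumerate(sorted(set(keys)))}
    return table.__getitem__


if __name__ == "__main__":
    words = ["apple", "banana", "cherry"]
    h = build_lookup_mphf(words)
    print(is_minimal_perfect(h, words))                    # table-based h
    print(is_perfect(lambda k: len(k) % 2, words, 2))      # collides on length
```

In this sketch, a function such as `len` on the keys `"a"`, `"bb"`, `"ccc"` is perfect for m = 4 but not minimal, because its values 1, 2, 3 do not fit in the range [0, 2].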