A Fast, Minimal Memory, Consistent Hash Algorithm
Abstract
We present jump consistent hash, a fast, minimal memory, consistent hash algorithm that can be expressed in about 5 lines of code. In comparison to the algorithm of Karger et al., jump consistent hash requires no storage, is faster, and does a better job of evenly dividing the key space among the buckets and of evenly dividing the workload when the number of buckets changes. Its main limitation is that the buckets must be numbered sequentially, which makes it more suitable for data storage applications than for distributed web caching.

Introduction

Karger et al. [1] introduced the concept of consistent hashing and gave an algorithm to implement it. Consistent hashing specifies a distribution of data among servers in such a way that servers can be added or removed without having to totally reorganize the data. It was originally proposed for web caching on the Internet, in order to address the problem that clients may not be aware of the entire set of cache servers.

Since then, consistent hashing has also seen wide use in data storage applications. Here, it addresses the problem of splitting data into a set of shards, where each shard is typically managed by a single server (or a small set of replicas). As the total amount of data changes, we may want to increase or decrease the number of shards. This requires moving data in order to split the data evenly among the new set of shards, and we would like to move as little data as possible while doing so.

Assume, for example, that data consisting of key-value pairs is to be split into 10 shards. A simple way to split the data is to compute a hash, h(key), of each key, and store the corresponding key-value pair in shard number h(key) mod 10. But if the amount of data grows, and now needs 12 shards to hold it, the simple approach would now assign each key to shard h(key) mod 12, which is probably not the same as h(key) mod 10; the data would need to be completely rearranged among the shards.
But it is only necessary to move 1/6 of the data stored in the 10 shards in order to end up with the data balanced among 12 shards. Consistent hashing provides this. Our jump consistent hash function takes a key and a number of buckets (i.e., shards), and returns one …
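
The 1/6 figure follows from balance: each of the 2 new shards should end up holding 1/12 of the data, so 2/12 = 1/6 of it has to move. The excerpt above breaks off before the algorithm itself, so the following is only a sketch, written in Python and following the commonly circulated form of the jump consistent hash function; the multiplier constant and the loop come from that form, not from the text above. The helpers `mod_hash` and `moved_fraction` are hypothetical names introduced for this illustration, and random 64-bit integers stand in for h(key).

```python
import random


def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit integer key to a bucket in [0, num_buckets)."""
    if num_buckets < 1:
        raise ValueError("num_buckets must be positive")
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step; the mask emulates uint64 overflow.
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        # Jump ahead by a pseudorandom amount derived from the top 31 bits.
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


def mod_hash(key: int, num_buckets: int) -> int:
    """The simple scheme from the text: shard number is h(key) mod n."""
    return key % num_buckets


def moved_fraction(assign, keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose shard changes when old_n shards become new_n."""
    moved = sum(1 for k in keys if assign(k, old_n) != assign(k, new_n))
    return moved / len(keys)


# Assumption: random 64-bit integers play the role of already-hashed keys.
rng = random.Random(0)
keys = [rng.getrandbits(64) for _ in range(20000)]

mod_moved = moved_fraction(mod_hash, keys, 10, 12)
jump_moved = moved_fraction(jump_consistent_hash, keys, 10, 12)
print(f"mod: {mod_moved:.3f}  jump: {jump_moved:.3f}")
```

Under these assumptions the mod scheme should move roughly 5/6 of the keys (a key stays only when h(key) mod 60 is below 10), while the jump function should move roughly 1/6, and every key it moves lands in one of the two new shards.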
Journal: CoRR
Volume: abs/1406.2294
Publication date: 2014