Packed Compact Tries: A Fast and Efficient Data Structure for Online String Processing
Abstract
In this paper, we present a new data structure called the packed compact trie (packed c-trie), which stores a set S of k strings of total length n in n log σ + O(k log n) bits of space and supports fast pattern matching queries and updates, where σ is the size of the alphabet. Assume that α = log_σ n letters are packed in a single machine word on the standard word RAM model, and let f(k, n) denote the query and update times of the dynamic predecessor/successor data structure of our choice, which stores k integers from the universe [1, n] in O(k log n) bits of space. Then, given a string of length m, our packed c-tries support pattern matching queries and insert/delete operations in O((m/α) f(k, n)) worst-case time and in O(m/α + f(k, n)) expected time. Our experiments show that our packed c-tries are faster than the standard compact tries (a.k.a. Patricia trees) on real data sets. As an application of our packed c-trie, we show that the sparse suffix tree for a string of length n over prefix codes with k sampled positions, such as evenly-spaced and word-delimited sparse suffix trees, can be constructed online in O((n/α + k) f(k, n)) worst-case time and O(n/α + k f(k, n)) expected time with n log σ + O(k log n) bits of space. When k = O(n/α), by using the state-of-the-art dynamic predecessor/successor data structures, we obtain sub-linear time construction algorithms using only O(n/α) bits of space in both cases. We also discuss an application of our packed c-tries to online LZD factorization.
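The m/α term in the bounds above comes from comparing α packed letters per word operation instead of one letter at a time. The following Python sketch illustrates only that word-level comparison; it is not the authors' implementation and it does not build a trie. It assumes a DNA alphabet (σ = 4, two bits per letter) and a fixed 64-bit word, and the names `pack`, `lcp_packed`, `CODE`, and the constants are hypothetical, chosen for this illustration.

```python
BITS_PER_CHAR = 2                            # assumption: sigma = 4 (DNA)
WORD_BITS = 64                               # assumption: 64-bit machine word
CHARS_PER_WORD = WORD_BITS // BITS_PER_CHAR  # 32 letters fit in one word
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}      # hypothetical letter encoding


def pack(s):
    """Pack s into 64-bit integers, first letter in the highest bits."""
    words = []
    for start in range(0, len(s), CHARS_PER_WORD):
        chunk = s[start:start + CHARS_PER_WORD]
        w = 0
        for ch in chunk:
            w = (w << BITS_PER_CHAR) | CODE[ch]
        # Left-align a short final chunk by padding with zero bits.
        w <<= BITS_PER_CHAR * (CHARS_PER_WORD - len(chunk))
        words.append(w)
    return words


def lcp_packed(a, len_a, b, len_b):
    """Longest common prefix length of two packed strings.

    Each loop iteration compares up to CHARS_PER_WORD letters at once.
    """
    limit = min(len_a, len_b)
    for i, (wa, wb) in enumerate(zip(a, b)):
        diff = wa ^ wb
        if diff:
            # Number of equal leading bits locates the first mismatch.
            equal_bits = WORD_BITS - diff.bit_length()
            lcp = i * CHARS_PER_WORD + equal_bits // BITS_PER_CHAR
            # Padding bits may differ past the end of the shorter string.
            return min(limit, lcp)
    return limit


s = "ACGT" * 20                  # 80 letters, stored in 3 words
t = s[:70] + "A" + s[71:]        # s[70] is "G", so t differs at index 70
print(lcp_packed(pack(s), len(s), pack(t), len(t)))
```

The XOR of two words is zero exactly when all packed letters agree, and otherwise the position of its highest set bit gives the first mismatching letter, so a string of length m is scanned in about m/32 iterations under these assumptions. In the word RAM setting of the abstract the word holds α = log_σ n letters rather than a fixed 32.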
Similar resources
String Processing Algorithms
The thesis describes extensive studies on various algorithms for efficient string processing. Data available in or via computers are often of enormous size, so it is important to devise time- and space-efficient methods to process them. Most such data are, in fact, stored and manipulated as strings. String matching is the most fundamental problem in string processing, where...
Applications of Succinct Dynamic Compact Tries to Some String Problems
The dynamic compact trie is a fundamental data structure for a wide range of string processing problems. In this paper, we report our recent work on succinct dynamic compact tries that store a set of strings of total length n in O(n log σ) space, supporting pattern matching and insert/delete operations in O((|P|/α) f(n)) time, where P is a pattern string, α = Θ(log_σ n), and f(n) = O((log log n) ...
Deterministic Indexing for Packed Strings
Given a string S of length n, the classic string indexing problem is to preprocess S into a compact data structure that supports efficient subsequent pattern queries. In the deterministic variant, the goal is to solve the string indexing problem without any randomization (at preprocessing time or query time). In the packed variant, the strings are stored with several characters in a single word, g...
A Dynamic Method for Answering Ad Hoc Continuous Aggregate Queries
Data streams are unbounded, fast sequences of time-stamped data elements that arrive in bursts. These elements generally need to be processed online and in real time, so algorithms that process data streams and answer queries over them are mostly one-pass. Executing such algorithms raises challenges such as memory limitations, scheduling, and the accuracy of answers. They will be more ...
Faster Dynamic Compact Tries with Applications to Sparse Suffix Tree Construction and Other String Problems
The dynamic compact trie is a fundamental data structure for a wide range of string processing problems. Jansson, Sadakane, and Sung (LNCS 4855, pp. 424-435, FSTTCS 2007) presented the dynamic uncompacted trie data structure of n nodes in O(n log σ) space supporting pattern matching in O((|P|/α) f(n)) time and insert/delete operations in O(f(n)) time, where f(n) = ((log log n)/log log log n) is th...
Journal: IEICE Transactions
Volume: 100-A
Publication date: 2016