Compressed Data Structures for Dynamic Sequences
Authors
Abstract
We consider the problem of storing a dynamic string S over an alphabet Σ = {1, ..., σ} in compressed form. Our representation supports insertions and deletions of symbols and answers three fundamental queries: access(i, S) returns the i-th symbol in S, rank_a(i, S) counts how many times a symbol a occurs among the first i positions in S, and select_a(i, S) finds the position where a symbol a occurs for the i-th time. We present the first fully-dynamic data structure for arbitrarily large alphabets that achieves optimal query times for all three operations and supports updates with worst-case time guarantees. Ours is also the first fully-dynamic data structure that needs only nH_k + o(n log σ) bits, where H_k is the k-th order entropy and n is the string length. Moreover, our representation supports extraction of a substring S[i..i + ℓ] in optimal O(log n / log log n + ℓ / log_σ n) time.
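To make the semantics of the operations named in the abstract concrete, here is a minimal reference sketch in Python. It is not the data structure the paper presents: it stores the string as a plain uncompressed list and takes O(n) time per operation, so it reflects neither the nH_k + o(n log σ) space bound nor the optimal query times. The class name `NaiveDynamicSequence` and all method names are hypothetical, chosen for illustration; positions are 1-based to match the abstract.

```python
class NaiveDynamicSequence:
    """Reference semantics only: uncompressed, O(n) per operation.

    Hypothetical illustration; this is not the paper's structure.
    """

    def __init__(self, symbols=()):
        self._s = list(symbols)

    def insert(self, i, a):
        """Insert symbol a so that it becomes the i-th symbol (1-based)."""
        if not 1 <= i <= len(self._s) + 1:
            raise IndexError("insert position out of range")
        self._s.insert(i - 1, a)

    def delete(self, i):
        """Delete the i-th symbol (1-based)."""
        if not 1 <= i <= len(self._s):
            raise IndexError("delete position out of range")
        del self._s[i - 1]

    def access(self, i):
        """Return the i-th symbol (1-based)."""
        if not 1 <= i <= len(self._s):
            raise IndexError("access position out of range")
        return self._s[i - 1]

    def rank(self, a, i):
        """Count occurrences of symbol a among the first i positions."""
        if not 0 <= i <= len(self._s):
            raise IndexError("rank position out of range")
        return self._s[:i].count(a)

    def select(self, a, i):
        """Return the position of the i-th occurrence of a, or None."""
        seen = 0
        for pos, sym in enumerate(self._s, start=1):
            if sym == a:
                seen += 1
                if seen == i:
                    return pos
        return None

    def extract(self, i, length):
        """Return the substring S[i..i + length], both ends inclusive."""
        if not (1 <= i and length >= 0 and i + length <= len(self._s)):
            raise IndexError("extract range out of bounds")
        return self._s[i - 1 : i + length]
```

A simple implementation like this can serve as a test oracle when checking a compressed implementation: both should return identical answers for any sequence of updates and queries.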
Similar resources
Space-efficient Data Structures for Collections of Textual Data
This thesis focuses on the design of succinct and compressed data structures for collections of string-based data, specifically sequences of semi-structured documents in textual format, sets of strings, and sequences of strings. The study of such collections is motivated by a large number of applications both in theory and practice. For textual semi-structured data, we introduce the concept of ...
A Framework of Dynamic Data Structures for String Processing
In this paper we present DYNAMIC, an open-source C++ library implementing dynamic compressed data structures for string manipulation. Our framework includes useful tools such as searchable partial sums, succinct/gap-encoded bitvectors, and entropy/run-length compressed strings and FM-indexes. We prove close-to-optimal theoretical bounds for the resources used by our structures, and show that ou...
Practical aspects of Compressed Suffix Arrays and FM-Index in Searching DNA Sequences
Searching patterns in the DNA sequence is an important step in biological research. To speed up the search process, one can index the DNA sequence. However, classical indexing data structures like suffix trees and suffix arrays are not feasible for indexing DNA sequences due to main memory requirement, as DNA sequences can be very long. In this paper, we evaluate the performance of two compress...
Grammar Compressed Sequences
Sequence representations supporting not only direct access to their symbols, but also rank/select operations, are a fundamental building block in many compressed data structures. Several recent applications need to represent highly repetitive sequences, and classical statistical compression proves ineffective. We introduce, instead, grammar-based representations for repetitive sequences, which ...
Comparison of Seismic Behavior of Buckling-restrained Braces and Yielding Brace System in Irregular and Regular Steel Frames under Mainshock and Mainshock-Aftershock
Due to low stiffness of braces after yielding, the structures with buckling-restrained braces (BRBs) experience high residual drifts during an earthquake, which can be intensified by aftershocks and causes considerable damages to structures. Also, due to poor distribution of stiffness, this problem is exacerbated for irregular structures. Recently, the yielding brace system (YBS) has been intro...
Publication date: 2015