Block trees
Authors
Abstract
Let string S[1..n] be parsed into z phrases by the Lempel-Ziv algorithm. The corresponding compression algorithm encodes S in O(z) space, but it does not support random access to S. We introduce a data structure, the block tree, that represents S in O(z log(n/z)) space and extracts any symbol of S in time O(log(n/z)), among other space-time tradeoffs. The structure also supports other queries that are useful for building compressed data structures on top of S. Further, block trees can be built in linear time and in a scalable manner. Our experiments show that block trees offer relevant space-time tradeoffs compared to other compressed string representations for highly repetitive strings.
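The idea of replacing repeated blocks with pointers to earlier text can be illustrated with a minimal sketch. It is a simplified variant and not the construction from the paper: it has no block-marking rule, so it does not give the O(z log(n/z)) space or O(log(n/z)) time bounds, and a pointer is resolved by restarting from the root. The names `Block`, `build`, `access`, and `count`, the parameters `arity` and `leaf_len`, and the 0-based indexing are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Block:
    start: int                       # 0-based start of the block in s
    length: int
    text: Optional[str] = None       # set for explicit leaves
    source: Optional[int] = None     # set for pointer leaves (earlier occurrence)
    child_len: int = 0               # length of each child (last one may be shorter)
    children: List["Block"] = field(default_factory=list)


def build(s: str, start: int = 0, length: Optional[int] = None,
          arity: int = 2, leaf_len: int = 4) -> Block:
    """Recursively split s[start:start+length]; arity >= 2 and leaf_len >= 1 assumed."""
    if length is None:
        length = len(s)
    if length <= leaf_len:
        # Small blocks are stored explicitly.
        return Block(start, length, text=s[start:start + length])
    first = s.find(s[start:start + length])   # leftmost occurrence (brute force)
    if first < start:
        # The content already appears earlier: keep only a pointer to it.
        return Block(start, length, source=first)
    child_len = -(-length // arity)            # ceiling division
    node = Block(start, length, child_len=child_len)
    for off in range(0, length, child_len):
        node.children.append(
            build(s, start + off, min(child_len, length - off), arity, leaf_len))
    return node


def access(root: Block, i: int) -> str:
    """Return s[i] using only the tree."""
    node = root
    while True:
        if node.text is not None:
            return node.text[i - node.start]
        if node.source is not None:
            # Jump to the same offset inside the earlier occurrence.
            # The new position is strictly smaller, so the loop terminates.
            i = node.source + (i - node.start)
            node = root
            continue
        node = node.children[(i - node.start) // node.child_len]


def count(node: Block) -> int:
    """Number of nodes in the tree."""
    return 1 + sum(count(c) for c in node.children)
```

On a highly repetitive input such as `"abracadabra" * 20`, most blocks become pointers, so `count(build(s))` stays well below `len(s)`, while `access` still returns every symbol of the original string.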
Similar resources
Two-Dimensional Block Trees
The Block Tree (BT) is a novel compact data structure designed to compress sequence collections. It obtains compression ratios close to Lempel-Ziv and supports efficient direct access to any substring. The BT divides the text recursively into fixed-size blocks and those appearing earlier are represented with pointers. On repetitive collections, a few blocks can represent all the others, and thu...
The Context Trees of Block Sorting Compression
The Burrows-Wheeler transform (BWT) and block sorting compression are closely related to the context trees of PPM. The usual approach of treating BWT as merely a permutation is not able to fully exploit this relation. We show that an explicit context tree for BWT can be efficiently generated by taking a subset of the corresponding suffix tree, identify the central problems in exploiting its st...
The Block Connectivity of Random Trees
Let r, m, and n be positive integers such that rm = n. For each i ∈ {1, ..., m} let B_i = {r(i − 1) + 1, ..., ri}. The r-block connectivity of a tree on n labelled vertices is the vertex connectivity of the graph obtained by collapsing the vertices in B_i, for each i, to a single (pseudo-)vertex v_i. In this paper we prove that, for fixed values of r, with r ≥ 2, the r-block connectivity of a...
A Self-index on Block Trees
The Block Tree is a recently proposed data structure that reaches compression close to Lempel-Ziv while supporting efficient direct access to text substrings. In this paper we show how a self-index can be built on top of a Block Tree so that it provides efficient pattern searches while using space proportional to that of the original data structure. More precisely, if a Lempel-Ziv parse cuts a t...
Implementing block-stored prefix trees in XML-DBMS
Efficient search through large amounts of text data is a well-known problem in computer science. We would like to introduce a BST data structure that allows searches through a set of string values, and is optimized for reading and writing large blocks of data. This paper describes the algorithms for insertion, deletion and search of variable-length strings in disk-resident trie stru...
Journal
Journal title: Journal of Computer and System Sciences
Year: 2021
ISSN: 1090-2724, 0022-0000
DOI: https://doi.org/10.1016/j.jcss.2020.11.002