A Separation Between Run-Length SLPs and LZ77
Abstract
In this paper we give an infinite family of strings for which the length of the Lempel-Ziv’77 parse is a factor Ω(log n / log log n) smaller than the size of the smallest run-length grammar.
Similar resources
Distance Measures for Sequences
Given a set of sequences, the distance between pairs of them helps us find their similarity and derive structural relationships among them. For genomic sequences such measures make it possible to construct the evolutionary tree of organisms. In this paper we compare several distance measures and examine a method that involves circular shifting one sequence against the other for finding good al...
Fingerprints in Compressed Strings
The Karp-Rabin fingerprint of a string is a type of hash value that due to its strong properties has been used in many string algorithms. In this paper we show how to construct a data structure for a string S of size N compressed by a context-free grammar of size n that answers fingerprint queries. That is, given indices i and j, the answer to a query is the fingerprint of the substring S[i, j]...
From LZ77 to the Run-Length Encoded Burrows-Wheeler Transform, and Back
The Lempel-Ziv factorization (LZ77) and the Run-Length encoded Burrows-Wheeler Transform (RLBWT) are two important tools in text compression and indexing, whose sizes z and r are closely related to the amount of self-repetition in the text. In this paper we consider the problem of converting the two representations into each other within a working space proportional to the input and the output. L...
Composite Repetition-Aware Data Structures
In highly repetitive strings, like collections of genomes from the same species, distinct measures of repetition all grow sublinearly in the length of the text, and indexes targeted to such strings typically depend only on one of these measures. We describe two data structures whose size depends on multiple measures of repetition at once, and that provide competitive tradeoffs between the time ...
Restructuring Compressed Texts without Explicit Decompression
We consider the problem of restructuring compressed texts without explicit decompression. We present algorithms which allow conversions from compressed representations of a string T produced by any grammar-based compression algorithm, to representations produced by several specific compression algorithms including LZ77, LZ78, run length encoding, and some grammar based compression algorithms. T...
Journal title:
- CoRR
Volume abs/1711.07270, Issue -
Pages -
Publication date 2017