Bit-Optimal Lempel-Ziv compression
Authors
Abstract
One of the most famous and most investigated lossless data-compression schemes is the one introduced by Lempel and Ziv about 40 years ago [23]. This compression scheme is known as "dictionary-based compression" and consists of squeezing an input string by replacing some of its substrings with (shorter) codewords, which are actually pointers to a dictionary of phrases built as the string is processed. Surprisingly enough, although many fundamental results are nowadays known about upper bounds on the speed and effectiveness of this compression process (see e.g. [12, 16] and references therein), "we are not aware of any parsing scheme that achieves optimality when the LZ77-dictionary is in use under any constraint on the codewords other than being of equal length" [16, p. 159]. Here optimality means achieving the minimum number of bits in compressing each individual input string, without any assumption on its generating source. In this paper we provide the first LZ-based compressor which computes the bit-optimal parsing of any input string in efficient time and optimal space, for a general class of variable-length codeword encodings which encompasses most of the ones typically used in data compression and in the design of search engines and compressed indexes [14, 17, 22].
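To make the notion of a bit-optimal parsing concrete, the following is a minimal brute-force sketch, not the efficient algorithm the paper announces. It treats each text position as a node, each literal or LZ77 copy as a forward edge weighted by its codeword length in bits, and finds the cheapest path by dynamic programming. The cost model is an assumption made for illustration only: a literal costs 9 bits (a flag bit plus an 8-bit character), and a copy costs a flag bit plus Elias gamma codes for its distance and length. The names `gamma_bits`, `bit_optimal_parse`, and `decode` are hypothetical, the window is unbounded, and the running time is cubic in the worst case.

```python
def gamma_bits(n: int) -> int:
    """Length in bits of the Elias gamma code of an integer n >= 1."""
    return 2 * (n.bit_length() - 1) + 1


def bit_optimal_parse(s: str, literal_bits: int = 9):
    """Return (total_bits, phrases) for a cheapest LZ77 parsing of s.

    Assumed cost model (illustrative only):
      literal: literal_bits
      copy   : 1 flag bit + gamma(distance) + gamma(length)
    """
    n = len(s)
    cost = [0] + [float("inf")] * n   # cost[i]: fewest bits to encode s[:i]
    back = [None] * (n + 1)           # back[i]: (previous position, phrase)
    for i in range(n):
        # Edge for emitting s[i] as a literal.
        if cost[i] + literal_bits < cost[i + 1]:
            cost[i + 1] = cost[i] + literal_bits
            back[i + 1] = (i, ("lit", s[i]))
        # Edges for every copy that starts at an earlier position j.
        for j in range(i):
            d = i - j
            length = 0
            # Overlapping copies are allowed, as in LZ77.
            while i + length < n and s[j + length] == s[i + length]:
                length += 1
                c = cost[i] + 1 + gamma_bits(d) + gamma_bits(length)
                if c < cost[i + length]:
                    cost[i + length] = c
                    back[i + length] = (i, ("copy", d, length))
    # Walk the back-pointers to recover the phrase sequence.
    phrases = []
    k = n
    while k > 0:
        k, phrase = back[k]
        phrases.append(phrase)
    phrases.reverse()
    return cost[n], phrases


def decode(phrases) -> str:
    """Rebuild the text from a phrase list produced by bit_optimal_parse."""
    out = []
    for phrase in phrases:
        if phrase[0] == "lit":
            out.append(phrase[1])
        else:
            _, d, length = phrase
            for _ in range(length):
                out.append(out[-d])
    return "".join(out)
```

Because every edge points forward, `cost[i]` is already final when position `i` is processed, so a single left-to-right pass suffices. Under a different codeword encoding the optimal parsing can change, which is why the problem depends on the class of encodings considered.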
Similar resources
On the bit-complexity of Lempel-Ziv compression
One of the most famous and investigated lossless data-compression schemes is the one introduced by Lempel and Ziv about 30 years ago [37]. This compression scheme is known as “dictionary-based compressor” and consists of squeezing an input string by replacing some of its substrings with (shorter) codewords which are actually pointers to a dictionary of phrases built as the string is processed. ...
Hardware Approach of Lempel-Ziv-Welch Algorithm for Binary Data Compression
In a distributed environment, large data files remain a major bottleneck. Compression is an important component of the solutions available for creating file sizes of manageable and transmittable dimensions. When high-speed media or channels are used, high-speed data compression is desired. Software implementations are often not fast enough. In this paper, we present the very high speed hardware...
A Modified Lempel–Ziv Welch Source Coding Algorithm for Efficient Data Compression
Lempel–Ziv Welch (LZW) algorithm is a well-known powerful data compression algorithm created by Abraham Lempel, Jacob Ziv, and Terry Welch. The algorithm is designed to be fast to implement but is not usually optimal because it performs only limited analysis of the data. A modified LZW algorithm on source coding will be proposed in this paper to improve the compression efficiency of the existin...
Optimal Universal Lossless Compression with Side Information
This paper presents conditional versions of Lempel-Ziv (LZ) algorithm for settings where compressor and decompressor have access to the same side information. We propose a fixed-length-parsing LZ algorithm with side information, motivated by the Willems algorithm, and prove the optimality for any stationary processes. In addition, we suggest strategies to improve the algorithm which lower the d...
Optimal Lempel-Ziv based lossy compression for memoryless data: how to make the right mistakes
Compression refers to encoding data using bits, so that the representation uses as few bits as possible. Compression can be lossless (i.e., the encoded data can be recovered exactly from its representation) or lossy, where the data is compressed more than in the lossless case but can still be recovered to within a prespecified distortion metric. In this paper, we prove the optimality of Codelet Parsing...
Journal:
- CoRR
Volume abs/0802.0835, Issue -
Pages -
Publication date: 2008