Universal lossless data compression with side information by using a conditional MPM grammar transform

Authors

  • En-Hui Yang
  • Alexei Kaltchenko
  • John C. Kieffer
Abstract

A grammar transform is a transformation that converts any data sequence to be compressed into a grammar from which the original data sequence can be fully reconstructed. In a grammar-based code, a data sequence is first converted into a grammar by a grammar transform and then losslessly encoded. Among several recently proposed grammar transforms is the multilevel pattern matching (MPM) grammar transform. In this paper, the MPM grammar transform is first extended to the case of side information known to both the encoder and decoder, yielding a conditional MPM (CMPM) grammar transform. A new, simple algorithm with linear time and space complexity is then proposed to implement the MPM and CMPM grammar transforms. Based on the CMPM grammar transform, a universal lossless data compression algorithm with side information is developed, which asymptotically achieves the conditional entropy rate of any stationary, ergodic source pair. It is shown that the algorithm’s worst-case redundancy per sample against the $k$-context conditional empirical entropy among all individual sequences of length $n$ is upper-bounded by $c(1/\log n)$, where $c$ is a constant. The proposed algorithm with side information is the first in the coming family of conditional grammar-based codes, whose expected high efficiency is due to the efficiency of the corresponding unconditional codes.
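
To make the notion of a grammar transform concrete, the sketch below builds a small grammar by merging identical adjacent pairs level by level and then expands it back to the original string. It is only a toy in the spirit of multilevel pattern matching, not the MPM or CMPM algorithm of the paper: the function names `toy_grammar_transform` and `expand`, the variable naming scheme `A0, A1, ...`, and the restriction to inputs whose length is a power of two are all assumptions made for illustration, and side information is not modeled.

```python
def toy_grammar_transform(s):
    """Toy grammar transform (illustrative assumption, not the paper's MPM).

    Requires len(s) to be a positive power of two and every terminal to be
    a single character, so terminals never collide with variable names.
    Returns (start_symbol, rules), where rules maps a variable name to a
    (left, right) pair of symbols.
    """
    n = len(s)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a positive power of two")
    rules = {}   # variable name -> (left, right)
    names = {}   # (left, right) -> variable name, so repeats share a rule
    level = list(s)
    while len(level) > 1:
        merged = []
        for i in range(0, len(level), 2):
            pair = (level[i], level[i + 1])
            if pair not in names:
                name = f"A{len(names)}"
                names[pair] = name
                rules[name] = pair
            merged.append(names[pair])
        level = merged
    return level[0], rules


def expand(symbol, rules):
    """Reconstruct the string generated by `symbol` under `rules`."""
    if symbol not in rules:
        return symbol  # terminal symbol
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)


start, rules = toy_grammar_transform("abababab")
restored = expand(start, rules)
```

In this toy, repeated blocks share a single rule, so a highly repetitive input yields far fewer rules than symbols, and expanding the start symbol recovers the input exactly. A grammar-based code would then losslessly encode the rules rather than the raw sequence.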

Similar resources

Study On Universal Lossless Data Compression by using Context Dependence Multilevel Pattern Matching Grammar Transform

In this paper, the context dependence multilevel pattern matching (CDMPM for short) grammar transform is proposed; based on this grammar transform, a universal lossless data compression algorithm, the CDMPM code, is then developed. Moreover, it is proved that this algorithm’s worst-case redundancy among all individual sequences of length $n$ from a finite alphabet is upper-bounded by $C(1/\log n)$ w...


Universal lossless compression via multilevel pattern matching

A universal lossless data compression code called the multilevel pattern matching code (MPM code) is introduced. In processing a finite-alphabet data string of length $n$, the MPM code operates at $O(\log\log n)$ levels sequentially. At each level, the MPM code detects matching patterns in the input data string (substrings of the data appearing in two or more nonoverlapping positions). The matching pat...


Efficient universal lossless data compression algorithms based on a greedy sequential grammar transform - Part one: Without context models

A grammar transform is a transformation that converts any data sequence to be compressed into a grammar from which the original data sequence can be fully reconstructed. In a grammar-based code, a data sequence is first converted into a grammar by a grammar transform and then losslessly encoded. In this paper, a greedy grammar transform is first presented; this grammar transform constructs sequ...


Optimal Universal Lossless Compression with Side Information

This paper presents conditional versions of the Lempel-Ziv (LZ) algorithm for settings where the compressor and decompressor have access to the same side information. We propose a fixed-length-parsing LZ algorithm with side information, motivated by the Willems algorithm, and prove its optimality for any stationary process. In addition, we suggest strategies to improve the algorithm which lower the d...


Journal:
  • IEEE Trans. Information Theory

Volume 47  Issue 

Pages  -

Publication date: 2001