Structured Grammar-based Codes for Universal Lossless Data Compression

Authors

  • JOHN KIEFFER
  • EN-HUI YANG
Abstract

A grammar-based code losslessly compresses each finite-alphabet data string x by compressing a context-free grammar Gx that represents x in the sense that the language of Gx is {x}. In an earlier paper, we showed that if Gx is an irreducible grammar for every data string x, then the resulting grammar-based code has maximal redundancy/sample O(log log n / log n) for n data samples. To further reduce the maximal redundancy/sample, in the present paper we first decompose a context-free grammar into its structure and its data content, then encode the data content conditional on the structure, and finally replace the irreducible grammar condition with a mild condition on the structures of all grammars used to represent distinct data strings of a fixed length. The resulting grammar-based codes are called structured grammar-based codes. We prove a coding theorem which shows that a structured grammar-based code has maximal redundancy/sample O(1 / log n), provided that a weak regular structure condition is satisfied.
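To make the phrase "the language of Gx is {x}" concrete, here is a minimal Python sketch of a context-free grammar that derives exactly one string. The toy grammar, the name `expand`, and the convention that dictionary keys are variables while all other symbols are terminals are illustrative assumptions, not taken from the paper; the sketch does not show the paper's structure/data-content decomposition or its encoder.

```python
from typing import Dict, List

# Assumed representation: each variable maps to the right-hand side of its
# single production rule. Any symbol that is not a key is a terminal.
Grammar = Dict[str, List[str]]


def expand(grammar: Grammar, symbol: str = "S") -> str:
    """Return the unique terminal string derived from `symbol`.

    Assumes one rule per variable and no cycles, so the language
    generated from each variable contains exactly one string.
    """
    if symbol not in grammar:
        return symbol  # terminal symbol: derives itself
    return "".join(expand(grammar, s) for s in grammar[symbol])


# Hypothetical toy grammar representing x = "abababab".
toy_grammar: Grammar = {
    "S": ["A", "A"],
    "A": ["B", "B"],
    "B": ["a", "b"],
}

x = expand(toy_grammar)
print(x)
```

The three rules contain six right-hand-side symbols in total, whereas x has eight symbols; a grammar-based code compresses x indirectly by encoding a grammar of this kind rather than x itself.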

Related resources

Grammar-based codes: A new class of universal lossless source codes

We investigate a type of lossless source code called a grammar-based code, which, in response to any input data string x over a fixed finite alphabet, selects a context-free grammar Gx representing x in the sense that x is the unique string belonging to the language generated by Gx. Lossless compression of x takes place indirectly via compression of the production rules of the grammar Gx. It is shown that, su...

Universal lossless data compression with side information by using a conditional MPM grammar transform

A grammar transform is a transformation that converts any data sequence to be compressed into a grammar from which the original data sequence can be fully reconstructed. In a grammar-based code, a data sequence is first converted into a grammar by a grammar transform and then losslessly encoded. Among several recently proposed grammar transforms is the multilevel pattern matching (MPM) grammar ...

Study On Universal Lossless Data Compression by using Context Dependence Multilevel Pattern Matching Grammar Transform

In this paper, the context dependence multilevel pattern matching (CDMPM for short) grammar transform is proposed; based on this grammar transform, a universal lossless data compression algorithm, the CDMPM code, is then developed. Moreover, it is proved that this algorithm's worst-case redundancy among all individual sequences of length n from a finite alphabet is upper bounded by C(1 / log n) w...

Universal Coding for Lossless and Lossy Complementary Delivery Problems

This paper deals with a coding problem called complementary delivery, where messages from two correlated sources are jointly encoded and each decoder reproduces one of two messages using the other message as the side information. Both lossless and lossy universal complementary delivery coding schemes are investigated. In the lossless case, it is demonstrated that a universal complementary deliv...

Publication date: 2002