Average Profile of the Lempel-Ziv Parsing Scheme for Markovian Source

Authors

Philippe Jacquet

Wojciech Szpankowski

Jing Tang

Abstract

Jing Tang, Microsoft Corporation, One Microsoft Way, 1/2061, Redmond, WA 98052, U.S.A.

For a Markovian source, we analyze the Lempel-Ziv parsing scheme that partitions sequences into phrases such that a new phrase is the shortest phrase not seen in the past. We consider three models. In the Markov Independent model, several sequences are generated independently by Markovian sources, and the ith phrase is the shortest prefix of the ith sequence that was not seen before as a phrase (i.e., as a prefix of the previous (i - 1) sequences). In the other two models, only a single sequence is generated by a Markovian source. In the second model, for which we coin the name Gilbert-Kadota model, a fixed number of phrases is generated according to the Lempel-Ziv algorithm, thus producing a sequence of variable (random) length. In the last model, also known as the Lempel-Ziv model, a string of fixed length is partitioned into a variable (random) number of phrases. These three models can be efficiently represented and analyzed by digital search trees, which are of interest to other algorithms such as sorting, searching and pattern matching. In this paper, we concentrate on analyzing the average profile (i.e., the average number of phrases of a given length), the typical phrase length, and the length of the last phrase. We obtain asymptotic expansions for the mean and the variance of the phrase length, and we prove that the appropriately normalized phrase length in all three models tends to the standard normal distribution, which leads to bounds on the average redundancy of the Lempel-Ziv code. For the Markov Independent model, this finding is established by analytic methods (i.e., generating functions, Mellin transform and depoissonization), while for the other two models we use a combination of analytic and probabilistic analyses.
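As a rough illustration of the parsing rule described in the abstract (each new phrase is the shortest prefix of the remaining input not yet seen as a phrase), the following Python sketch parses a fixed-length string and counts phrases by length. It corresponds loosely to the third (Lempel-Ziv) model for one concrete input; the names `lz_parse` and `profile` are hypothetical and do not come from the paper, and a plain set stands in for the digital search tree.

```python
from collections import Counter


def lz_parse(text):
    """Split `text` into Lempel-Ziv phrases.

    Each phrase is the shortest prefix of the unparsed input that has
    not already appeared as a phrase. Returns (phrases, tail), where
    `tail` is the leftover suffix that was already seen as a phrase
    (empty if the input ends exactly on a phrase boundary).
    """
    seen = set()      # stand-in for the digital search tree (assumption)
    phrases = []
    current = ""
    for symbol in text:
        current += symbol
        if current not in seen:
            seen.add(current)
            phrases.append(current)
            current = ""
    return phrases, current


def profile(phrases):
    """Count phrases by length: an empirical profile for one input."""
    return Counter(len(p) for p in phrases)


phrases, tail = lz_parse("1011010100010")
print(phrases)           # phrases in order of creation
print(profile(phrases))  # number of phrases of each length
```

The average profile studied in the paper is the expectation of such counts over strings drawn from a Markovian source; this sketch only computes the counts for a single given string.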


Similar resources

Average Profile of the Lempel-Ziv Parsing Scheme for a Markovian

For a Markovian source, we analyze the Lempel-Ziv parsing scheme that partitions sequences into phrases such that a new phrase is the shortest phrase not seen in the past. We consider three models: In the Markov Independent model, several sequences are generated independently by Markovian sources, and the ith phrase is the shortest prefix of the ith sequence that was not seen before as a phrase ...


Universal coding of nonstationary sources

In this correspondence we investigate the performance of the Lempel–Ziv incremental parsing scheme on nonstationary sources. We show that it achieves the best rate achievable by a finite-state block coder for the nonstationary source. We also show a similar result for a lossy coding scheme given by Yang and Kieffer which uses a Lempel–Ziv scheme to perform lossy coding.


On Generalized Digital Search Trees with Applications to a Generalized Lempel-Ziv

The goal of this research is twofold: (i) to analyze generalized digital search trees, and (ii) to derive the average profile (i.e., phrase length) of a generalization of the well-known parsing algorithm due to Lempel and Ziv. In the generalized Lempel-Ziv parsing scheme, one partitions a sequence of symbols from a finite alphabet into phrases such that the new phrase is the longest substring seen...


Bit-Optimal Lempel-Ziv compression

One of the most famous and most investigated lossless data-compression schemes is the one introduced by Lempel and Ziv about 40 years ago [23]. This compression scheme is known as "dictionary-based compression" and consists of squeezing an input string by replacing some of its substrings with (shorter) codewords which are actually pointers to a dictionary of phrases built as the string is processed. ...


Average profile and limiting distribution for a phrase size in the Lempel-Ziv parsing algorithm

Wojciech Szpankowski, Department of Computer Science, Purdue University, W. Lafayette, IN 47907, U.S.A.

Consider the parsing algorithm due to Lempel and Ziv that partitions a sequence of length n into variable phrases (blocks) such that a new block is the shortest substring not seen in the past as a phrase. In practice the following parameters are of interest: number of phrases, the size of a phra...



Publication date: 2013
