Implementing the Context Tree Weighting Method for Text Compression

Authors

  • Kunihiko Sadakane
  • Takumi Okazaki
  • Hiroshi Imai
Abstract

The context tree weighting (CTW) method is a universal compression algorithm for FSMX sources. Although it is expected to achieve a good compression ratio in practice, it is difficult to implement, and many implementations serve only to estimate the compression ratio. Willems and Tjalkens presented a practical implementation that uses conditional probabilities instead of block probabilities, but it applies only to binary-alphabet sequences. We extend the method to multi-alphabet sequences and give a simple implementation based on PPM techniques. We also propose a method for optimizing a parameter of context tree weighting in the binary-alphabet case. Experimental results on texts and DNA sequences show that combining PPM with context tree weighting improves the performance of PPM, and that DNA sequences can be compressed to less than 2.0 bpc (bits per character).
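As a rough illustration of the binary case mentioned in the abstract, the following Python sketch computes the CTW code length of a bit sequence. It is not the authors' implementation: it works with block probabilities in the log domain, and it assumes the standard Krichevsky-Trofimov (KT) estimator, a fixed weight of 1/2 at every internal node, and an all-zero past context. The names `BinaryCTW`, `Node`, `log_add`, and `code_length_bits` are hypothetical and chosen only for this example.

```python
import math


def log_add(a, b):
    """Return log(exp(a) + exp(b)) without leaving the log domain."""
    if a < b:
        a, b = b, a
    return a + math.log1p(math.exp(b - a))


class Node:
    """One context-tree node: symbol counts plus two log-probabilities."""
    __slots__ = ("counts", "log_pe", "log_pw", "children")

    def __init__(self):
        self.counts = [0, 0]      # zeros and ones seen in this context
        self.log_pe = 0.0         # log KT estimate of the symbols seen here
        self.log_pw = 0.0         # log weighted probability of this subtree
        self.children = [None, None]


class BinaryCTW:
    """Depth-limited binary CTW (sketch; assumes an all-zero past)."""

    def __init__(self, depth):
        self.depth = depth
        self.root = Node()
        self.history = [0] * depth  # assumption: the unseen past is all zeros

    def update(self, bit):
        # Walk from the root to the leaf selected by the most recent bits.
        path = [self.root]
        node = self.root
        for d in range(self.depth):
            c = self.history[-1 - d]
            if node.children[c] is None:
                node.children[c] = Node()
            node = node.children[c]
            path.append(node)
        # Update estimates from the leaf back up to the root.
        for d in range(self.depth, -1, -1):
            node = path[d]
            a, b = node.counts
            # KT estimator: P(bit | a zeros, b ones) = (count + 1/2) / (a + b + 1)
            node.log_pe += math.log((node.counts[bit] + 0.5) / (a + b + 1))
            node.counts[bit] += 1
            if d == self.depth:
                node.log_pw = node.log_pe
            else:
                # A missing child has seen nothing, so it contributes log 1 = 0.
                log_children = sum(ch.log_pw for ch in node.children if ch is not None)
                node.log_pw = math.log(0.5) + log_add(node.log_pe, log_children)
        self.history.append(bit)


def code_length_bits(bits, depth):
    """Ideal code length, -log2 of the root weighted probability."""
    model = BinaryCTW(depth)
    for bit in bits:
        model.update(bit)
    return -model.root.log_pw / math.log(2)
```

The conditional probability of the next bit, which is the quantity an arithmetic coder needs, is the ratio of the root's weighted probability after and before the update. With depth 0 the model reduces to a single KT estimator.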


Similar resources

The Context-Tree Weighting Method: Extensions - IEEE Transactions on Information Theory

First we modify the basic (binary) context-tree weighting method such that the past symbols $x_{1-D}, x_{2-D}, \ldots, x_0$ are not needed by the encoder and the decoder. Then we describe how to make the context-tree depth $D$ infinite, which results in optimal redundancy behavior for all tree sources, while the number of records in the context tree is not larger than $2T - 1$. Here $T$ is the length of the source se...


A Study of the Context Tree Maximizing Method

The context tree weighting method can be adapted so that it finds the minimum description length model (MDL model) that corresponds to the data. This paper studies the resulting algorithm, the context tree maximizing algorithm, together with a few modifications of it, focusing on its performance when applied to data compression.


Context-Tree Weighting and Maximizing: Processing Betas

The context-tree weighting method (Willems, Shtarkov, and Tjalkens [1995]) is a sequential universal source coding method that achieves the Rissanen lower bound [1984] for tree sources. The same authors also proposed context-tree maximizing, a two-pass version of the context-tree weighting method [1993]. Later Willems and Tjalkens [1998] described a method based on ratios (betas) of sequence pr...


Arithmetic Coding with Adaptive Context-Tree Weighting for the H.264 Video Coders

We propose applying an adaptive context-tree weighting (CTW) method in H.264 video coders. We first investigate two different ways of incorporating the CTW method into an H.264 coder and compare the coding effectiveness of the method with that of the context models specified in the H.264 standard. We then describe a novel approach for automatically adapting the CTW method based ...




Publication date: 2000