Fast Construction of Disposable Prefix-Free Codes
Abstract
Some data compression techniques use large numbers of prefix-free codes. Two such techniques are adaptive Huffman encoding and bit recycling. In adaptive Huffman encoding, each successive symbol is encoded according to the statistics of the symbols seen so far. Bit recycling is designed to improve the efficiency of a certain class of compression techniques (those that allow multiple encodings of the same data), and it repeatedly has to build prefix-free codes that are each used to encode or decode only one symbol. For adaptive Huffman encoding, the simple but inefficient solution is to build a prefix-free code from scratch according to the current statistics (using, say, Huffman’s algorithm) before encoding each symbol. However, efficient algorithms for adaptive encoding exist (e.g., Vitter’s algorithm) that take advantage of the fact that the statistics evolve only gradually. Bit recycling, by contrast, is unlikely to reuse the same, or even a similar, prefix-free code. Consequently, many prefix-free codes must be constructed from scratch. We propose a fast technique for constructing prefix-free codes that trades the optimality of the codes it builds for speed. In our measurements, the technique is 3 to 4 times faster than Huffman’s algorithm, while the encodings of the symbols are on average only 4% longer in a general context and only 1.4% longer in a bit-recycling context.
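For orientation, the from-scratch baseline mentioned in the abstract (building a prefix-free code with Huffman’s algorithm from the current statistics) might look like the following minimal Python sketch. It is not the fast technique proposed in the paper, which the abstract does not describe. The function names `huffman_code`, `is_prefix_free`, and `average_length` are hypothetical and chosen only for illustration.

```python
import heapq
from itertools import count


def huffman_code(freqs):
    """Return {symbol: bitstring} for a dict of positive symbol frequencies.

    Baseline only: this is the classic Huffman construction, not the
    faster, suboptimal technique that the abstract proposes.
    """
    if not freqs:
        return {}
    if len(freqs) == 1:
        # A lone symbol still needs one bit so that it can be emitted.
        return {next(iter(freqs)): "0"}
    tiebreak = count()  # unique counter so equal weights never compare dicts
    heap = [(w, next(tiebreak), {s: ""}) for s, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prepending one bit to each side.
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, next(tiebreak), merged))
    return heap[0][2]


def is_prefix_free(code):
    """Check that no codeword is a prefix of another codeword."""
    # After sorting, any prefix relation shows up between adjacent words.
    words = sorted(code.values())
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


def average_length(code, freqs):
    """Expected codeword length in bits per symbol under freqs."""
    total = sum(freqs.values())
    return sum(freqs[s] * len(code[s]) for s in code) / total
```

In a disposable setting such as bit recycling, a routine like `huffman_code` would be called once per encoded or decoded symbol, which is the cost the paper aims to reduce. Comparing `average_length` for a faster construction against this baseline is one way to express the kind of length overhead the abstract reports.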
Similar resources
On the construction of prefix-free and fix-free codes with specified codeword compositions
We investigate the construction of prefix-free and fix-free codes with specified codeword compositions. We present a polynomial time algorithm which constructs a fix-free code with the same codeword compositions as a given code for a special class of codes called distinct codes. We consider the construction of optimal fix-free codes which minimize the average codeword cost for general letter co...
A Construction for Balancing Non-Binary Sequences Based on Gray Code Prefixes
We introduce a new construction for the balancing of non-binary sequences that makes use of Gray codes for prefix coding. Our construction provides full encoding and decoding of sequences, including the prefix. This construction is based on a generalization of Knuth’s parallel balancing approach, which can handle very long information sequences. However, the overall sequence—composed of the info...
Completing prefix codes in submonoids
Let M be a submonoid of the free monoid A∗, and let X ⊆ M be a variable length code (for short a code). X is weakly M-complete if any word in M is a factor of some word in X∗ [J. Néraud, C. Selmi, Free monoid theory: maximality and completeness in arbitrary submonoids, Internat. J. Algorithms Comput. 13(5) (2003) 507–516]. Given a code X ⊆ M , we are interested in the construction of a weakly M...
Using an innovative coding algorithm for data encryption
This paper discusses the problem of using data compression for encryption. We first propose an algorithm for breaking a prefix-coded file by enumeration. Based on the algorithm, we respectively analyze the complexity of breaking Huffman codes and Shannon-Fano-Elias codes under the assumption that the cryptanalyst knows the code construction rule and the probability mass function of the source. ...
Error states and synchronization recovery for variable length codes
The Synchronization Variable length codes. The Construction of Variable length codes with good Synchronization Properties. On the Expected codeword Length per Symbol of optimal prefix codes for extended Sources.