Is there an algorithm for “perfect” compression?


Let me clarify: I'm not talking about perfect compression in the sense of an algorithm that is able to compress any given source material; I realize that is impossible. What I…


2 Answers

花落未央

2021-02-08 20:48

As stated by Mark, the general answer is "no", due to Kolmogorov complexity. Let me expand a bit on that.

Compression is basically two steps: 1) modeling and 2) entropy coding.

The role of the model is to "guess" the next bytes or fields to come. A model can take any form, and there is no limit to its effectiveness. A trivial example is a random number generator function: from an external perspective, its output looks like noise and therefore cannot be compressed. But if you know the generation function, an infinitely long sequence can be compressed into a small piece of code: the generator function itself (plus its seed).
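To make the generator example concrete, here is a minimal sketch in Python. The `generate` helper, the seed 42 and the length are assumptions chosen purely for illustration, with `zlib` standing in for "a compressor that does not know the model".

```python
import random
import zlib

def generate(seed: int, n: int) -> bytes:
    """Hypothetical 'source': n pseudo-random bytes from a seeded generator."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))

data = generate(seed=42, n=100_000)

# A general-purpose compressor only sees noise and cannot shrink it.
packed = zlib.compress(data, 9)
print(len(data), len(packed))

# Someone who knows the model needs only (seed, length) to rebuild everything.
description = (42, 100_000)
rebuilt = generate(*description)
print(rebuilt == data)
```

The "compressed" form here is just the pair `(seed, length)` plus the code of `generate`, no matter how long the sequence gets.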

That's why there is "no limit", and that is what Kolmogorov complexity tells you: you can never guarantee that there isn't a better way to "model" the data.

The second part is computable: entropy coding is where you find the "Shannon limit". Given a sequence of symbols (typically the output symbols from the model) drawn from an alphabet with known probabilities, you can compute the optimal cost in bits, and there are ways to reach that proven ultimate compression limit, which is the Shannon limit.
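As a rough illustration of "computing the optimal cost", the sketch below estimates the Shannon limit from symbol frequencies. The function name `shannon_entropy` and the sample message are made up for this example, and the probabilities are simply the observed frequencies.

```python
import math
from collections import Counter

def shannon_entropy(symbols) -> float:
    """Average bits per symbol when symbols are treated independently."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

message = b"abracadabra"
h = shannon_entropy(message)
print(f"{h:.3f} bits/symbol, about {h * len(message):.1f} bits for the whole message")
```

No entropy coder working on these symbols one at a time, with these probabilities, can do better on average than `h` bits per symbol.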

Huffman is optimal with regard to the Shannon limit if you accept the limitation that each symbol must be encoded using an integer number of bits. This is a close but imperfect approximation. Better compression can be achieved by using fractional bits, which is what arithmetic coders offer, as does the more recent ANS-based Finite State Entropy coder. Both get much closer to the Shannon limit.
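The cost of the integer-bit restriction shows up clearly on a skewed distribution. The following is a small sketch, not a production Huffman coder: `huffman_lengths` only computes code lengths, and the three-symbol distribution is an assumption picked to make the gap visible.

```python
import heapq
import math

def huffman_lengths(probs: dict) -> dict:
    """Code length in bits that a Huffman code assigns to each symbol."""
    # Heap items: (probability, unique tie-breaker, symbols in this subtree)
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in probs}
    tie = len(heap)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1  # each merge puts these symbols one level deeper
        heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
        tie += 1
    return lengths

probs = {"a": 0.9, "b": 0.05, "c": 0.05}
lengths = huffman_lengths(probs)
huffman_cost = sum(p * lengths[s] for s, p in probs.items())
entropy = -sum(p * math.log2(p) for p in probs.values())
print(lengths, huffman_cost, entropy)
```

Here "a" ideally costs about 0.15 bits, but Huffman cannot spend less than 1 bit on it, so the average cost is roughly double the entropy. That gap is what arithmetic coding and ANS are able to close.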

The Shannon limit only applies if you treat a set of symbols "individually". As soon as you try to "combine them", or find any correlations between the symbols, you are "modeling". And this is the territory of Kolmogorov Complexity, which is not computable.
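A tiny experiment shows how much "combining" symbols can change the picture. In this sketch (function names and test data are hypothetical), the same byte string is measured twice: once treating bytes individually, and once with the simplest possible model, where each byte is predicted from the previous one.

```python
import math
from collections import Counter

def order0_entropy(data: bytes) -> float:
    """Bits per symbol when every byte is treated individually."""
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def order1_entropy(data: bytes) -> float:
    """Bits per symbol when each byte is predicted from the previous byte."""
    pairs = Counter(zip(data, data[1:]))   # (previous, current) counts
    contexts = Counter(data[:-1])          # how often each 'previous' occurs
    n = len(data) - 1
    return -sum(c / n * math.log2(c / contexts[prev])
                for (prev, _), c in pairs.items())

data = b"ab" * 1000
print(order0_entropy(data))  # 'a' and 'b' are equally likely on their own
print(order1_entropy(data))  # but the previous byte fully determines the next
```

Individually, the symbols cost 1 bit each. With the one-byte context they cost essentially nothing. A better model could exist for any given data, and there is no general procedure to find the best one.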

借酒劲吻你

2021-02-08 21:06

No. It can be proven that there is not even an algorithm to determine how well a perfect compressor will do. See Kolmogorov Complexity.

Huffman coding (or arithmetic coding) by itself does not get close to the best compression. Other techniques need to be used to take advantage of higher order redundancies in the data.
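One way to see this for yourself is to run zlib twice on repetitive text: once restricted to Huffman coding only, and once with its default strategy, which also performs LZ77 string matching. This is only a sketch; the `deflate` helper and the sample text are assumptions for illustration, while `zlib.Z_HUFFMAN_ONLY` and `zlib.Z_DEFAULT_STRATEGY` are the standard strategy constants.

```python
import zlib

def deflate(data: bytes, strategy: int) -> bytes:
    """Compress with zlib at level 9 using the given strategy constant."""
    comp = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS,
                            zlib.DEF_MEM_LEVEL, strategy)
    return comp.compress(data) + comp.flush()

data = b"the quick brown fox jumps over the lazy dog. " * 500

huffman_only = deflate(data, zlib.Z_HUFFMAN_ONLY)        # entropy coding only
with_matching = deflate(data, zlib.Z_DEFAULT_STRATEGY)   # LZ77 matches + Huffman

print(len(data), len(huffman_only), len(with_matching))
```

The Huffman-only output stays at a few bits per character because it never notices that whole phrases repeat. The string matcher exploits exactly that higher-order redundancy.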
