Is there an algorithm for “perfect” compression?

夕颜 · 2021-02-08 20:12

Let me clarify: I'm not talking about perfect compression in the sense of an algorithm that is able to compress any given source material; I realize that is impossible. What I…

2 Answers
花落未央 · 2021-02-08 20:48

    As stated by Mark, the general answer is "no", due to Kolmogorov complexity. Let me expand a bit on that.

    Compression is basically two steps: 1) modeling, 2) entropy coding.

    The role of the model is to "guess" the next bytes or fields to come. A model can take any form, and there is no limit to its effectiveness. A trivial example is a random number generator: from the outside, its output looks like noise, and therefore cannot be compressed. But if you know the generating function, an infinitely long sequence can be compressed into a tiny piece of code: the generator function itself.
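
    A minimal sketch of this idea in Python (the seed, the stream length, and the use of zlib are my own illustrative assumptions, not part of the original answer):

        import random
        import zlib

        # A pseudo-random byte stream: to any model that does not know
        # the generator, this is indistinguishable from noise.
        rng = random.Random(42)
        stream = bytes(rng.randrange(256) for _ in range(100_000))

        # A generic compressor finds nothing to exploit: the output is
        # about as large as the input (slightly larger, with overhead).
        print(len(zlib.compress(stream)))

        # Yet the whole stream is fully described by a tiny program:
        # "seed random.Random with 42, draw 100_000 bytes". The best
        # possible "model" of this data is the generator itself.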

    That's why there is "no limit", and Kolmogorov complexity states exactly that: you can never guarantee that there is not a better way to "model" the data.

    The second part is computable: entropy coding is where you find the "Shannon limit". Given a set of symbols from an alphabet (typically, the output symbols of the model), you can compute the optimal cost and find a way to reach the proven lower bound on compressed size, the Shannon limit.
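
    For the computable part, a small sketch in Python (the function name and sample message are mine): the Shannon limit for independent symbols is H = -sum(p * log2(p)) bits per symbol, computed from the symbol probabilities.

        import math
        from collections import Counter

        def shannon_limit_bits(data: bytes) -> float:
            # H = -sum(p * log2(p)) over the empirical symbol distribution:
            # the minimum average number of bits per symbol, assuming the
            # symbols are independent and identically distributed.
            counts = Counter(data)
            n = len(data)
            return -sum((c / n) * math.log2(c / n) for c in counts.values())

        msg = b"abracadabra"
        h = shannon_limit_bits(msg)
        print(f"{h:.3f} bits/symbol, {h * len(msg):.1f} bits at best")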

    Huffman coding is optimal with respect to the Shannon limit if you accept the restriction that each symbol must be encoded using an integer number of bits. This is a close but imperfect approximation. Better compression can be achieved by using fractional bits, which is what arithmetic coders offer, as does the more recent ANS-based Finite State Entropy coder. Both get much closer to the Shannon limit.
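
    To make the gap concrete, consider a heavily skewed binary alphabet (the probabilities below are my own example, not from the answer):

        import math

        # Binary source: symbol 'A' with p = 0.99, symbol 'B' with p = 0.01.
        p = 0.99
        shannon = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
        print(f"Shannon limit: {shannon:.4f} bits/symbol")  # ~0.0808

        # Huffman must spend a whole number of bits per symbol, so the
        # shortest possible code here costs 1 bit for every symbol:
        print("Huffman cost : 1.0000 bits/symbol")

        # An arithmetic coder or ANS spends about -log2(p) bits per
        # symbol, i.e. fractional bits, and approaches the Shannon limit.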

    The Shannon limit only applies if you treat the symbols individually. As soon as you try to combine them, or exploit any correlation between the symbols, you are "modeling". And that is the territory of Kolmogorov complexity, which is not computable.
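
    A toy illustration of that boundary (the ramp data and the delta transform are my own example): treated individually, correlated symbols look expensive, while even a trivial model exposes the structure and shrinks the entropy.

        import math
        from collections import Counter

        def entropy_bits(seq) -> float:
            # Per-symbol Shannon entropy of the empirical distribution.
            counts = Counter(seq)
            n = len(seq)
            return -sum((c / n) * math.log2(c / n) for c in counts.values())

        # A strongly correlated sequence: a slowly rising ramp.
        ramp = [i // 4 for i in range(1024)]

        # Symbol by symbol, it looks expensive: 256 distinct values.
        print(f"raw  : {entropy_bits(ramp):.2f} bits/symbol")   # 8.00

        # A trivial model -- encode differences between neighbors --
        # reveals the correlation and collapses the per-symbol entropy.
        deltas = [b - a for a, b in zip(ramp, ramp[1:])]
        print(f"delta: {entropy_bits(deltas):.2f} bits/symbol")  # ~0.81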
