LZ4 compressed text is larger than uncompressed

Question

I have read that the LZ4 algorithm is very fast and compresses reasonably well. But in my test app the compressed text is larger than the source text. What is the problem?

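#include <cstdlib>
#include <ctime>
#include <iostream>
#include <string>
#include <vector>
#include <lz4.h>

int main() {
    std::srand(static_cast<unsigned>(std::time(nullptr)));
    std::string text;
    for (int i = 0; i < 65535; ++i)
        text.push_back(static_cast<char>(std::rand() % 256));  // uniformly random bytes

    std::cout << "Text size: " << text.size() << std::endl;

    // LZ4_compress_default replaces the deprecated LZ4_compress and takes the destination capacity.
    std::vector<char> compressedData(text.size() * 2);
    int compressedSize = LZ4_compress_default(text.data(), compressedData.data(),
        static_cast<int>(text.size()), static_cast<int>(compressedData.size()));
    std::cout << "Compressed size: " << compressedSize << std::endl;
}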

I also tried LZ4_compress, but the result is the same. However, if I generate a string from a single repeated symbol, or from just two different symbols, the data does get compressed.

Answer 1:

Have a look at a description of the LZ4 algorithm. It works by replacing repeated substrings with references to earlier occurrences of the same bytes, so the data it has already processed serves as its dictionary.

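Random text, or any other material without repeated sequences, will not compress well with it. LZ4 then has to store the input as literals plus a small amount of bookkeeping, which is why the output comes out slightly larger than the input. An entropy coder (such as Huffman coding) can do better when some symbols are more frequent than others, but uniformly random bytes like the ones in your test cannot be shrunk by any lossless compressor.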

Source: https://stackoverflow.com/questions/31839274/lz4-compressed-text-is-larger-than-uncompressed

Tags

compression

lzw

lz4
