Question
I have read that the LZ4 algorithm is very fast and compresses quite well. But in my test app the compressed text is larger than the source text. What is the problem?
srand(time(NULL));
std::string text;
// Fill the buffer with uniformly random bytes (values 0..255).
for (int i = 0; i < 65535; ++i)
    text.push_back((char)(rand() % 256));
cout << "Text size: " << text.size() << endl;
char *compressedData = new char[text.size() * 2];
// LZ4_compress takes (source, dest, sourceSize), in that order.
int compressedSize = LZ4_compress(text.c_str(), compressedData, (int)text.size());
cout << "Compressed size: " << compressedSize << endl;
delete[] compressedData;
I also tried LZ4_compress, but the result is the same. However, if I generate a string made of a single repeated symbol, or of just two different symbols, the data does compress.
Answer 1:
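Have a look at a description of the LZ4 algorithm. It works by finding byte sequences that already occurred earlier in the input and replacing them with back-references, so the data processed so far serves as the dictionary.
Random text, or any other material without repeated sequences, will not compress well with it. For input whose symbols are unevenly distributed but rarely repeat as sequences, a bit-level entropy coder (Huffman coding, for example) will probably do better. Uniformly random bytes like the ones in the question cannot be shrunk by any lossless algorithm, and the LZ4 output ends up slightly larger than the input because of the format's own overhead.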
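To make the point about repeated sequences concrete, here is a rough sketch of a greedy LZ77-style size estimator. It is not the real LZ4 format or encoder: the names estimateLzSize and makeRandomBytes are made up for illustration, and the costs (1 byte per literal, 3 bytes per back-reference) are assumptions chosen only to show the trend. The 4-byte minimum match length is the one detail borrowed from LZ4.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <random>
#include <string>
#include <unordered_map>

// Toy estimate of the output size of a greedy LZ77-style compressor.
// Assumed cost model (NOT the real LZ4 format): 1 byte per literal,
// 3 bytes per back-reference, no framing overhead.
std::size_t estimateLzSize(const std::string& data) {
    constexpr std::size_t kMinMatch = 4;   // shortest sequence worth referencing
    constexpr std::size_t kMatchCost = 3;  // assumed cost of one back-reference
    // Maps each 4-byte sequence to the most recent position where it started.
    std::unordered_map<std::uint32_t, std::size_t> lastSeen;
    std::size_t cost = 0;
    std::size_t i = 0;
    while (i + kMinMatch <= data.size()) {
        std::uint32_t key;
        std::memcpy(&key, data.data() + i, kMinMatch);
        auto it = lastSeen.find(key);
        if (it != lastSeen.end()) {
            // Seen before: extend the match as far as the bytes keep agreeing.
            const std::size_t prev = it->second;
            std::size_t len = kMinMatch;
            while (i + len < data.size() && data[prev + len] == data[i + len]) {
                ++len;
            }
            it->second = i;
            cost += kMatchCost;
            i += len;
        } else {
            // Not seen before: emit one literal byte and remember the position.
            lastSeen[key] = i;
            cost += 1;
            ++i;
        }
    }
    cost += data.size() - i;  // trailing bytes too short to form a match
    return cost;
}

// Hypothetical helper: n uniformly distributed bytes from a seeded generator.
std::string makeRandomBytes(std::size_t n, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> byteValue(0, 255);
    std::string out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<char>(byteValue(gen)));
    }
    return out;
}
```

Under these assumptions, estimateLzSize(std::string(65535, 'a')) comes out at just a few bytes, because almost the whole buffer is one long back-reference. estimateLzSize(makeRandomBytes(65535, 42u)) stays close to the input size, because a 4-byte sequence almost never recurs in 64 KB of random data. A real encoder additionally pays for tokens and length fields, which is why the actual LZ4 output for random input is slightly larger than the source.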
Source: https://stackoverflow.com/questions/31839274/lz4-compressed-text-is-larger-than-uncompressed