How to reduce a bigger string to a smaller string in C++? Probably by hashing?

只愿长相守 submitted on 2019-12-01 12:33:26

Well, if you don't need to uncompress it later:

string s = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx";
s = "";

Edit: Sounds like you want a hash function - there are a zillion out there, depending on your requirements. Google is your friend.
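If no particular hash is required, one option that needs no extra library is std::hash from <functional>. The following is only a minimal sketch; the helper name short_key is made up for illustration, and the returned value is implementation-defined, so it is only meaningful inside one running program.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Maps a string of any length to a fixed-size value using the standard
// library's hash. The value is implementation-defined and may differ between
// compilers or even between runs, so it should not be stored or exchanged.
std::size_t short_key(const std::string& s)  // hypothetical helper name
{
    return std::hash<std::string>{}(s);
}
```

Equal strings always give equal keys within one run of the program; different strings usually, but not necessarily, give different keys.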

As unaperson said, Google is your friend: Data Compression Algorithms.

Here are a few off the top of my head:
RLE -- run-length encoding
Huffman
Lempel-Ziv
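To give an idea of the simplest of these, here is a rough sketch of run-length encoding. The function name rle_encode and the count-then-character output format are assumptions chosen for this example, and it assumes the input contains no digit characters (otherwise the output could not be decoded unambiguously).

```cpp
#include <cstddef>
#include <string>

// Run-length encodes `input` as <count><char> pairs, e.g. "aaab" -> "3a1b".
// Assumption: the input contains no digit characters, otherwise the output
// would be ambiguous to decode.
std::string rle_encode(const std::string& input)
{
    std::string out;
    std::size_t i = 0;
    while (i < input.size()) {
        // Count how many times input[i] repeats starting at position i.
        std::size_t run = 1;
        while (i + run < input.size() && input[i + run] == input[i]) {
            ++run;
        }
        out += std::to_string(run);
        out += input[i];
        i += run;
    }
    return out;
}
```

This only shrinks strings with long runs of the same character; text without repeats actually doubles in length, which is why Huffman or Lempel-Ziv coding is usually preferred for general data.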

As I understand from the comments on the question, you don't need to decompress the result; you only want different strings to produce different results. The hashing algorithm I'm going to explain is very easy to understand and works well in practice (I've used it many times). It is the simple polynomial hash that serves as the rolling hash in the Rabin-Karp string search algorithm. Keep in mind that, like any hash into a fixed-size value, it cannot guarantee distinct results for all distinct strings: collisions are possible.

OK, let's treat each string as a number written in base 257 (257 is a prime number). Examples:

  1. "10" = code('1') * 257 + code('0')
  2. "p:;" = code('p') * 257^2 + code(':') * 257 + code(';')

Here code(a) is the ASCII code of character a plus 1 (the +1 ensures that a string of n '\0' characters and a string of m '\0' characters give different results). Of course, if the string is long, the corresponding number cannot be stored in an int or even in an unsigned long long. That is not a problem: you can reduce it modulo the range of the data type you store it in, and for unsigned types in C++ this happens automatically because overflow wraps around. So the final code of the hash function is as follows.

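#include <cstddef>
#include <string>
using std::string;

unsigned long long hash(const string & s)
{
    unsigned long long ret = 0;
    for(std::size_t i = 0; i < s.size(); ++i)
    {
        // Shift by one base-257 digit; unsigned overflow wraps modulo 2^64.
        ret *= 257;
        // Go through unsigned char so bytes >= 128 are not negative where char is signed.
        ret += static_cast<unsigned char>(s[i]) + 1;
    }
    return ret;
}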
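The function above hashes a whole string in one pass. What makes this scheme a rolling hash in Rabin-Karp is that the hash of a fixed-length window can be updated in constant time when the window slides by one character. Here is a sketch of that update; the names poly_hash and roll, and the parameter top, are made up for this example and are not from any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Same scheme as above: base 257, character code + 1, wrapping modulo 2^64.
std::uint64_t poly_hash(const std::string& s)
{
    std::uint64_t h = 0;
    for (unsigned char c : s) {
        h = h * 257 + (c + 1u);
    }
    return h;
}

// Slides a fixed-length window one character to the right:
// removes `out` (the leftmost character) and appends `in`.
// `top` must be 257^(window_length - 1), also taken modulo 2^64.
std::uint64_t roll(std::uint64_t h, char out, char in, std::uint64_t top)
{
    // Drop the contribution of the leading character.
    h -= (static_cast<unsigned char>(out) + 1u) * top;
    // Shift the remaining digits up and add the new trailing character.
    return h * 257 + (static_cast<unsigned char>(in) + 1u);
}
```

With these definitions, poly_hash("10") is 50 * 257 + 49 = 12899, and for a window of length 3 (top = 257 * 257), roll(poly_hash("abc"), 'a', 'd', top) gives the same value as poly_hash("bcd").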

EDIT: Added source of this algorithm.
