A minimal hash function for C?

攒了一身酷 2021-01-29 23:41

I can't use boost::hash because I have to stick with C and can't use C++.

But I need to hash a large number (10K to 100K) of token strings (5 to 40 bytes long) so t

6 Answers
  • 2021-01-30 00:00

    You can find a good (and fast) hash function, and an interesting read, at http://www.azillionmonkeys.com/qed/hash.html

    The only time you can skip checking for collisions is when you use a perfect hash, i.e. a good old-fashioned lookup table such as the ones gperf generates.

  • 2021-01-30 00:00

    xxHash is a fast and easy option. The simplest way to use it is the XXH32 function:

    unsigned int XXH32 (const void* input, int len, unsigned int seed);
    

    It returns a 32-bit hash. Since len is an int, for inputs larger than 2^31-1 bytes feed the data in chunks with these:

    void*         XXH32_init   (unsigned int seed);
    XXH_errorcode XXH32_update (void* state, const void* input, int len);
    unsigned int  XXH32_digest (void* state);
    
  • 2021-01-30 00:01
    1. Here is a nice overview of the most notable known hash functions.

    2. 32 bits should work just fine.

    3. You always need to check for collisions, unless you want to write a funny hashtable :)
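
    To illustrate point 3, here is a minimal sketch of a chained table that confirms every hash match with strcmp. The hash is 32-bit FNV-1a, picked only as a stand-in; NBUCKETS, struct node, table_find and table_insert are names made up for this example, and the fixed bucket count is an assumption.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 1024u   /* assumed size; must be a power of two */

/* 32-bit FNV-1a over a NUL-terminated string (stand-in hash). */
static uint32_t fnv1a32(const char *s)
{
    uint32_t h = 2166136261u;        /* FNV offset basis */
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;              /* FNV prime */
    }
    return h;
}

struct node {
    struct node *next;
    uint32_t hash;   /* cached full hash, a cheap pre-filter */
    char *key;       /* private copy of the token */
};

static struct node *buckets[NBUCKETS];

/* Returns the node holding key, or NULL if it is not in the table. */
static struct node *table_find(const char *key)
{
    uint32_t h = fnv1a32(key);
    for (struct node *n = buckets[h & (NBUCKETS - 1)]; n; n = n->next)
        if (n->hash == h && strcmp(n->key, key) == 0)  /* the collision check */
            return n;
    return NULL;
}

/* Inserts key if missing; returns its node, or NULL if malloc fails. */
static struct node *table_insert(const char *key)
{
    struct node *n = table_find(key);
    if (n)
        return n;
    uint32_t h = fnv1a32(key);
    size_t len = strlen(key) + 1;
    n = malloc(sizeof *n);
    if (!n)
        return NULL;
    n->key = malloc(len);
    if (!n->key) {
        free(n);
        return NULL;
    }
    memcpy(n->key, key, len);
    n->hash = h;
    n->next = buckets[h & (NBUCKETS - 1)];
    buckets[h & (NBUCKETS - 1)] = n;
    return n;
}
```

    Two different tokens can land in the same bucket, or even share the same 32-bit hash, so the strcmp is what makes the lookup correct; comparing the cached hash first just avoids most of the string comparisons.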

  • 2021-01-30 00:08

    A general hash function for hash table lookup. It specifies "Do NOT use for cryptographic purposes", but since you said you have no such intent, you should be OK.

    Also included is A Survey of Hash Functions to try out.

  • 2021-01-30 00:11

    If you're on a POSIX-like system and sticking to plain C, I would simply use what the system already offers. man 3 hcreate gives you all the details, or you can find an online version at http://linux.die.net/man/3/hcreate
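
    A rough sketch of how that could look for a token table. hcreate, hsearch and the ENTRY type come from <search.h>; the tokens_* wrappers and the 25% headroom are made up for this example.

```c
#include <search.h>
#include <stddef.h>
#include <stdint.h>

/* Creates the single process-wide table; returns 0 on failure. */
static int tokens_init(size_t max_tokens)
{
    /* The table cannot grow after hcreate(), so leave some headroom. */
    return hcreate(max_tokens + max_tokens / 4);
}

/* Stores id under key; returns 0 if the table is full.
 * hsearch() keeps the key pointer instead of copying the string,
 * so key must stay valid for the lifetime of the table. */
static int tokens_put(char *key, intptr_t id)
{
    ENTRY item = { .key = key, .data = (void *)id };
    ENTRY *slot = hsearch(item, ENTER);
    if (!slot)
        return 0;
    slot->data = (void *)id;   /* ENTER leaves an existing entry's data untouched */
    return 1;
}

/* Looks up key; returns 1 and writes *id if it is present. */
static int tokens_get(const char *key, intptr_t *id)
{
    ENTRY item = { .key = (char *)key, .data = NULL };
    ENTRY *slot = hsearch(item, FIND);
    if (!slot)
        return 0;
    *id = (intptr_t)slot->data;
    return 1;
}
```

    Keep in mind that this interface gives you only one table per process, has no way to delete a single entry, and is released as a whole with hdestroy().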

  • 2021-01-30 00:26

    Try Adler32 for long strings or Murmur2 for short strings.
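
    Adler-32 is small enough to write out by hand. A minimal sketch following the RFC 1950 definition; the name adler32_simple is made up here (zlib ships its own adler32), and this version takes the modulo on every byte for clarity rather than speed.

```c
#include <stddef.h>
#include <stdint.h>

/* Adler-32 checksum of len bytes at data, per RFC 1950. */
static uint32_t adler32_simple(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t a = 1, b = 0;              /* a: running byte sum, b: sum of the a values */
    for (size_t i = 0; i < len; i++) {
        a = (a + p[i]) % 65521u;        /* 65521 is the largest prime below 2^16 */
        b = (b + a) % 65521u;
    }
    return (b << 16) | a;
}
```

    For a 5-byte token, a stays below 1 + 5 * 255 = 1276, so the low half of the result uses only a small part of its 16-bit range. That is one reason to prefer a different function for short strings.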
