Best hashing algorithm in terms of hash collisions and performance for strings

忘掉有多难 2020-11-28 03:37

What would be the best hashing algorithm if we had the following priorities (in that order):

  1. Minimal hash collisions
  2. Performance

It doe

9 answers
  • 2020-11-28 03:58

    You can get both using the Knuth hash function described here.

    It's extremely fast assuming a power-of-2 hash table size -- just one multiply, one shift, and one bit-and. More importantly (for you) it's great at minimizing collisions (see this analysis).

    Some other good algorithms are described here.
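A minimal sketch of the multiply-shift-mask scheme this answer describes, assuming a 32-bit integer key and a table of 2^bits slots. For a string, some integer digest (for example `String.hashCode()`) would have to be computed first, since the multiplicative step only scrambles an existing integer. The names `KnuthHash` and `bucket` are hypothetical, and the multiplier 2654435761 is the constant usually attributed to Knuth rather than something taken from the linked pages.

```java
final class KnuthHash {
    // 2654435761 (0x9E3779B1), roughly 2^32 divided by the golden ratio.
    // Assumption: this is the commonly cited multiplier, not one quoted in the thread.
    private static final int MULTIPLIER = 0x9E3779B1;

    /**
     * Maps a 32-bit key to a slot in a table of size 2^bits.
     * One multiply (wrapping int arithmetic), one unsigned shift, one bit-and.
     */
    static int bucket(int key, int bits) {
        if (bits < 1 || bits > 31) {
            throw new IllegalArgumentException("bits must be in 1..31");
        }
        int mask = (1 << bits) - 1;
        // The high bits of the product are the best mixed, so keep those.
        // The mask is redundant after the unsigned shift; it is kept only to
        // mirror the "multiply, shift, bit-and" description.
        return ((key * MULTIPLIER) >>> (32 - bits)) & mask;
    }
}
```

For a string key, a call such as `KnuthHash.bucket("hello".hashCode(), 10)` would pick one of 1024 slots. Collisions between different strings still depend on the quality of the integer digest fed in.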

  • 2020-11-28 04:00

    Here is a straightforward way of implementing it yourself: http://www.devcodenote.com/2015/04/collision-free-string-hashing.html

    Here is a snippet from the post:

    If, say, we have a character set of capital English letters, then the length of the character set is 26, where A could be represented by the number 0, B by the number 1, C by the number 2, and so on until Z by the number 25. Now, whenever we want to map a string of this character set to a unique number, we perform the same conversion as we did in the case of the binary format.
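The conversion in the snippet amounts to reading the string as a base-26 number. A rough sketch, assuming input restricted to the letters A to Z and a `long` result; the names `Base26Code` and `encode` are hypothetical and do not come from the linked post. Two limits are worth noting: a `long` holds at most 13 such letters (26^13 is below 2^63, 26^14 is not), and because A maps to 0, the result is unique only among strings of the same length (leading A's vanish like leading zeros).

```java
final class Base26Code {
    // 26^13 fits in a signed 64-bit long; 26^14 does not.
    static final int MAX_LENGTH = 13;

    /** Reads an A..Z string as a base-26 number with A=0, B=1, ..., Z=25. */
    static long encode(String s) {
        if (s.length() > MAX_LENGTH) {
            throw new IllegalArgumentException("at most " + MAX_LENGTH + " letters fit in a long");
        }
        long code = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 'A' || c > 'Z') {
                throw new IllegalArgumentException("only A..Z is supported: " + c);
            }
            // Shift the digits so far one place left, then append the new digit.
            code = code * 26 + (c - 'A');
        }
        return code;
    }
}
```

With this sketch, `encode("BA")` is 26 and `encode("ZZ")` is 675, while `encode("B")` and `encode("AB")` both give 1. So the scheme is collision-free for fixed-length keys but not across lengths unless the length is encoded as well.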

  • 2020-11-28 04:01

    The simple hashCode used by Java's String class might show a suitable algorithm.

    Below is the "GNU Classpath" implementation. (License: GPL)

      /**
       * Computes the hashcode for this String. This is done with int arithmetic,
       * where ** represents exponentiation, by this formula:<br>
       * <code>s[0]*31**(n-1) + s[1]*31**(n-2) + ... + s[n-1]</code>.
       *
       * @return hashcode value of this String
       */
      public int hashCode()
      {
        if (cachedHashCode != 0)
          return cachedHashCode;
    
        // Compute the hash code using a local variable to be reentrant.
        int hashCode = 0;
        int limit = count + offset;
        for (int i = offset; i < limit; i++)
          hashCode = hashCode * 31 + value[i];
        return cachedHashCode = hashCode;
      }
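For experimenting outside the `String` class, the same polynomial can be written as a standalone method. This is only a sketch: the names `PolyHash` and `hash` are hypothetical, and the caching and `offset`/`count` fields of the Classpath version are left out.

```java
final class PolyHash {
    /** Computes s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] in wrapping int arithmetic. */
    static int hash(CharSequence s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            // Horner's rule: multiply the running value by 31, then add the next char.
            h = 31 * h + s.charAt(i);
        }
        return h;
    }
}
```

Since the formula is the one documented for `String.hashCode()`, `PolyHash.hash("abc")` should equal `"abc".hashCode()`. Regarding the first priority in the question, this function is fast but not collision-resistant: short strings such as "Aa" and "BB" both hash to 2112.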
    