Creating a hash from several Java string objects

情深已故 2021-01-17 14:35

What would be the fastest and most robust (in terms of uniqueness) way to implement a method like

public abstract String hash(String[] values);


        
5 answers
  • 2021-01-17 14:50

    It doesn't provide a 64-bit hash, but given the title of the question it's worth mentioning that java.util.Objects#hash(Object...) has been available since Java 1.7.
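A minimal sketch of how Objects.hash could be applied to the String[] from the question. The class name ObjectsHashDemo and the wrapper method are hypothetical, chosen only for illustration. The result is a 32-bit int and is the same value Arrays.hashCode returns for the array.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical demo class, not part of the question's code.
final class ObjectsHashDemo {

    // The cast makes explicit that the array itself is used as the varargs array,
    // so every element contributes to the result.
    static int hash(String[] values) {
        return Objects.hash((Object[]) values);
    }

    public static void main(String[] args) {
        String[] values = {"a", "b"};
        System.out.println(hash(values));            // 4066
        System.out.println(Arrays.hashCode(values)); // 4066, same algorithm
    }
}
```

Since the result has only 32 bits, it is better suited to hash tables than to anything that needs near-unique identifiers.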

  • 2021-01-17 14:56

    Definitely don't use plain addition due to its linearity properties, but you can modify your code just slightly to achieve very good dispersion.

    public String hash(String[] values) {
      // Multiply before adding so that the order of the elements affects the result.
      long result = 17;
      for (String v : values) {
        result = 37 * result + v.hashCode();
      }
      return String.valueOf(result);
    }
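To illustrate why plain addition disperses poorly, here is a small sketch comparing it with the multiply-then-add loop above. The class and method names (CombineDemo, additive, multiplicative) are hypothetical. Addition is commutative, so any reordering of the same strings collides, while the multiplicative version distinguishes the two orders.

```java
// Hypothetical demo class comparing two ways of combining String hash codes.
final class CombineDemo {

    // Plain addition: the order of the elements is lost.
    static long additive(String[] values) {
        long result = 0;
        for (String v : values) {
            result += v.hashCode();
        }
        return result;
    }

    // Multiply-then-add, as in the answer above.
    static long multiplicative(String[] values) {
        long result = 17;
        for (String v : values) {
            result = 37 * result + v.hashCode();
        }
        return result;
    }

    public static void main(String[] args) {
        String[] x = {"foo", "bar"};
        String[] y = {"bar", "foo"};
        System.out.println(additive(x) == additive(y));             // true: collision
        System.out.println(multiplicative(x) == multiplicative(y)); // false
    }
}
```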
    
  • 2021-01-17 14:56

    First, a hash code is typically numeric, e.g. an int. Moreover, your version of the hash function computes an int and then returns its string representation, which IMHO doesn't make sense.

    I'd improve your hash method as follows:

    public int hash(String[] values) {
        // Accumulate in an int to match the return type; overflow simply wraps around.
        int result = 0;
        for (String v : values) {
            result = result * 31 + v.hashCode();
        }
        return result;
    }
    

    Take a look at how hashCode() is implemented in the class java.lang.String.

  • 2021-01-17 14:57

    You should watch out for creating weaknesses when combining methods (the Java hash function and your own). I did a little research on cascaded ciphers, and this is an example of that: the addition might interfere with the internals of hashCode().

    The internals of hashCode() look like this:

            for (int i = 0; i < len; i++) {
                h = 31*h + val[off++];
            }
    

    so adding the hash codes together causes the last characters of all strings in the array to simply be added up, which doesn't lower the randomness (it is already bad enough for a hash function).
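The effect described above can be checked with a short sketch. The class LastCharDemo and its helper methods are hypothetical. polyHash repeats the polynomial loop over the characters of a string, and sumOfHashes is the plain addition under discussion. Because the last character of each string enters with coefficient 1, swapping the last characters between two strings leaves the sum unchanged.

```java
// Hypothetical demo class; polyHash mirrors the loop shown above.
final class LastCharDemo {

    // Same polynomial as String.hashCode(): s[0]*31^(n-1) + ... + s[n-1].
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    // Plain addition of the individual hash codes.
    static int sumOfHashes(String[] values) {
        int total = 0;
        for (String v : values) {
            total += v.hashCode();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(polyHash("hello") == "hello".hashCode()); // true
        // Last characters swapped between the two strings: the sums are equal.
        System.out.println(sumOfHashes(new String[]{"ax", "by"})
                == sumOfHashes(new String[]{"ay", "bx"}));            // true
    }
}
```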

    If you want better pseudorandomness, take a look at the FNV hash algorithm. It is a very fast non-cryptographic hash that was designed for uses such as hash tables.

    It goes like this:

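        long hash = 0xCBF29CE484222325L;   // 64-bit FNV offset basis
        for (String s : strings)
        {
            hash ^= s.hashCode();          // mix in the 32-bit String hash (sign-extended to long)
            hash *= 0x100000001B3L;        // 64-bit FNV prime
        }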
    

    ^ This is not the actual FNV implementation, since it takes ints as input instead of bytes, but I think it works just as well.
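For comparison, a byte-wise variant might look like the sketch below. It follows the FNV-1a scheme (XOR the byte, then multiply by the prime) over the UTF-8 bytes of each string. The class name Fnv1a64, the method names, the hex-string return value, and the 0xFF separator between strings are all assumptions made for this example, not part of the answer. The separator is there so that {"ab", "c"} and {"a", "bc"} hash differently, and the strings are assumed to be non-null.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class: 64-bit FNV-1a over bytes.
final class Fnv1a64 {
    private static final long OFFSET_BASIS = 0xCBF29CE484222325L;
    private static final long PRIME = 0x100000001B3L;

    // One FNV-1a step: XOR in the byte (as unsigned), then multiply by the prime.
    private static long update(long hash, int unsignedByte) {
        return (hash ^ unsignedByte) * PRIME;
    }

    // Plain FNV-1a over a byte array.
    static long hashBytes(byte[] data) {
        long h = OFFSET_BASIS;
        for (byte b : data) {
            h = update(h, b & 0xFF);
        }
        return h;
    }

    // Hashes all strings in order; assumes no element is null.
    static String hashStrings(String[] values) {
        long h = OFFSET_BASIS;
        for (String v : values) {
            for (byte b : v.getBytes(StandardCharsets.UTF_8)) {
                h = update(h, b & 0xFF);
            }
            // Assumed separator: 0xFF never occurs in UTF-8 encoded text.
            h = update(h, 0xFF);
        }
        return Long.toHexString(h);
    }

    public static void main(String[] args) {
        // Published FNV-1a 64-bit test vector for the single byte "a".
        System.out.println(Long.toHexString(
                hashBytes("a".getBytes(StandardCharsets.UTF_8)))); // af63dc4c8601ec8c
        System.out.println(hashStrings(new String[]{"ab", "c"})
                .equals(hashStrings(new String[]{"a", "bc"})));     // false
    }
}
```

A 64-bit result still cannot guarantee uniqueness; it only makes accidental collisions much less likely than with a 32-bit hash code.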

  • 2021-01-17 14:59

    Here is a simple implementation using the Objects class, available since Java 7.

    @Override
    public int hashCode()
    {
        return Objects.hash(this.variable1, this.variable2);
    }
    