Can Java's hashCode produce the same value for different strings?

前端未结


Is it possible for different strings to have the same hash code using Java's hashCode function? If so, what is the probability (as a percentage) of that happening?

12 answers

一向

2020-11-28 09:31

> if it is possible then what is the % of its possibility?

That is not a particularly meaningful question.

However, unless there is some systemic bias in the String::hashCode function or in the way that you are generating the String objects, the probability that any two different (non-equal) String objects have the same hash code is 1 in 2³².

This assumes that the Strings are chosen randomly from the set of all possible String values. If you restrict the set in various ways, the probability will differ from the above number. (For instance, the existence of the "FB" / "Ea" collision means that the probability of a collision within the set of all two-letter strings is higher than the norm.)

Another thing to note is that the chance of 2³² different strings chosen at random (from a much larger unbiased set of strings) having no hash collisions is vanishingly small. To understand why, read the Wikipedia page on the Birthday Paradox.

In reality, the only way you are going to get no hash collisions in a set of 2³² different strings is to select or generate the strings deliberately. Even forming the set by filtering randomly generated strings is going to be computationally expensive. To produce such a set efficiently, you would need to exploit the properties of the String::hashCode algorithm, which (fortunately) is specified.

死守一世寂寞

2020-11-28 09:31
Yes, this is possible. The contract between the equals() and hashCode() methods of the Object class does not require unequal objects to have different hash codes. If two objects are not equal according to equals(), their hash codes may or may not be equal; that is, if obj1.equals(obj2) returns false, then obj1.hashCode() == obj2.hashCode() may still return true. Example:
```
String str1 = "FB";
String str2 = "Ea";
System.out.println(str1.equals(str2)); // false
System.out.println(str1.hashCode() == str2.hashCode()); // true
```
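The "FB" / "Ea" pair can also serve as a building block. Because String.hashCode is a polynomial in 31 and both strings have the same length, swapping one for the other inside a longer string should leave the hash unchanged. A minimal sketch of that idea (the class name `CollisionBlocks` and the helper `build` are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;

class CollisionBlocks {
    // Hypothetical helper: builds every string made of `blocks` two-character
    // pieces, where each piece is either "FB" or "Ea".
    static List<String> build(int blocks) {
        List<String> result = new ArrayList<>();
        result.add("");
        for (int i = 0; i < blocks; i++) {
            List<String> next = new ArrayList<>();
            for (String prefix : result) {
                next.add(prefix + "FB");
                next.add(prefix + "Ea");
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        // 2^3 = 8 distinct strings; each line should show the same hash code
        for (String s : build(3)) {
            System.out.println(s + " -> " + s.hashCode());
        }
    }
}
```

With k blocks this yields 2^k distinct strings that all share one hash code, which is why a known collision makes it easy to produce many more.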
甜味超标

2020-11-28 09:36
Yes, it is entirely possible. The probability that at least two strings in a collection share a hash code depends on the size of that collection (assuming all strings in it are unique; the same reasoning applies to other object types). The approximate figures are:
- With a set of size ~9,000, there is about a 1% chance that at least two strings in the set have the same hash code
- With a set of size ~30,000, there is about a 10% chance that at least two strings in the set have the same hash code
- With a set of size ~77,000, there is about a 50% chance that at least two strings in the set have the same hash code
The assumptions made are:
- The hashCode function has no bias
- Each string in the aforementioned set is unique
This site explains it clearly: http://eclipsesource.com/blogs/2012/09/04/the-3-things-you-should-know-about-hashcode/ (Look at "the second thing you should know")
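The figures above can be reproduced with the usual birthday-problem approximation, p ≈ 1 − exp(−n(n−1) / (2·2³²)), which assumes hash codes are uniformly distributed over all 2³² int values. A rough sketch (the class and method names are illustrative, not from any library):

```java
class BirthdayEstimate {
    // Number of distinct 32-bit hash values.
    static final double HASH_SPACE = Math.pow(2, 32);

    // Approximate probability that at least two of n uniformly random
    // 32-bit hash codes coincide: 1 - exp(-n(n-1) / (2 * 2^32)).
    // Assumption: the hash function has no bias.
    static double collisionProbability(long n) {
        double pairs = (double) n * (n - 1) / 2.0;
        return 1.0 - Math.exp(-pairs / HASH_SPACE);
    }

    public static void main(String[] args) {
        for (long n : new long[] {9_000, 30_000, 77_000}) {
            System.out.printf("n = %d: %.1f%%%n", n, 100 * collisionProbability(n));
        }
    }
}
```

Under that uniformity assumption the three set sizes come out close to 1%, 10%, and 50% respectively; real String data is not uniformly distributed, so treat these as estimates.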
再見小時候

2020-11-28 09:43

A Java hash code is 32 bits, so it can take at most 2³² distinct values. The number of possible strings it hashes is infinite.

So yes, there will be collisions. The percentage is meaningless - there is an infinite number of items (strings) and a finite number of possible hashes.

暗喜

2020-11-28 09:45
```
System.out.println("tensada".hashCode());
System.out.println("friabili".hashCode());
```
Java's hash function returns equal values for these two strings.
陌清茗

2020-11-28 09:50

This doesn't directly answer your question, but I hope it helps.

The following is from the source code of java.lang.String.

```
/**
 * Returns a hash code for this string. The hash code for a
 * <code>String</code> object is computed as
 * <blockquote><pre>
 * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
 * </pre></blockquote>
 * using <code>int</code> arithmetic, where <code>s[i]</code> is the
 * <i>i</i>th character of the string, <code>n</code> is the length of
 * the string, and <code>^</code> indicates exponentiation.
 * (The hash value of the empty string is zero.)
 *
 * @return  a hash code value for this object.
 */
public int hashCode() {
    int h = hash;
    int len = count;
    if (h == 0 && len > 0) {
        int off = offset;
        char val[] = value;

        for (int i = 0; i < len; i++) {
            h = 31*h + val[off++];
        }
        hash = h;
    }
    return h;
}
```
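To see how this formula leads to the "FB" / "Ea" collision mentioned elsewhere in this thread, here is a small standalone sketch that re-implements the documented loop (`PolyHash` and `polyHash` are names invented for this example; this is not the JDK's internal code):

```java
class PolyHash {
    // Re-implements the documented formula with wrapping int arithmetic:
    // h = 31*h + s[i] for each character, starting from h = 0.
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // 'F' = 70, 'B' = 66: 70*31 + 66 = 2236
        // 'E' = 69, 'a' = 97: 69*31 + 97 = 2236
        System.out.println(polyHash("FB"));
        System.out.println(polyHash("Ea"));
        // Should agree with the built-in implementation
        System.out.println(polyHash("FB") == "FB".hashCode());
    }
}
```

Lowering the first character by one subtracts 31 from the sum, and raising the second character by 31 adds it back, so the two totals match.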
