I have a structure in C#:
public struct UserInfo
{
    public string str1 { get; set; }
    public string str2 { get; set; }
}
See Jon Skeet's answer: bitwise operations like ^ are a poor choice here, because they often produce colliding hash codes!
Ah yes, as Gary Shutler pointed out:
return str1.GetHashCode() + str2.GetHashCode();
can overflow, which throws an OverflowException when the code is compiled in a checked context. You could cast to long as Artem suggested, or wrap the expression in unchecked:
return unchecked(str1.GetHashCode() + str2.GetHashCode());
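To make the overflow point concrete, here is a minimal sketch. The class HashSum and its method names are hypothetical, and the two int parameters stand in for hash codes, since real string hash codes differ between runtimes.

```csharp
using System;

// Hypothetical helper for illustration; a and b stand in for two hash codes.
public static class HashSum
{
    // Wraps around silently on overflow, regardless of compiler settings.
    public static int Unchecked(int a, int b)
    {
        return unchecked(a + b);
    }

    // Throws OverflowException when the sum does not fit in an int.
    public static int Checked(int a, int b)
    {
        return checked(a + b);
    }
}
```

HashSum.Unchecked(int.MaxValue, 1) wraps around to int.MinValue, while HashSum.Checked(int.MaxValue, 1) throws. For a hash code the wrapped value is perfectly usable, which is why unchecked is the usual choice.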
GetHashCode's result is supposed to be:
Bearing those in mind, I would go with something like this:
if (str1 == null)
{
    return str2 == null ? 0 : str2.GetHashCode();
}
if (str2 == null)
{
    return str1.GetHashCode();
}
unchecked
{
    // Cast through uint so that a negative hash code is not sign-extended into the upper 32 bits.
    return ((ulong)(uint)str1.GetHashCode() | ((ulong)(uint)str2.GetHashCode() << 32)).GetHashCode();
}
Edit: Forgot the nulls. Code fixed.
public override int GetHashCode()
{
    unchecked
    {
        return (str1 ?? String.Empty).GetHashCode() +
               (str2 ?? String.Empty).GetHashCode();
    }
}
Using the '+' operator might be better than using '^': although you explicitly want ('AA', 'BB') and ('BB', 'AA') to be the same, you may not want ('AA', 'AA') and ('BB', 'BB') to be the same (or all equal pairs, for that matter). With '^', any pair of identical strings hashes to 0.
This solution does not strictly follow the 'as fast as possible' rule, because for nulls it calls 'GetHashCode()' on the empty string rather than immediately returning a known constant. Even without measuring, I am willing to hazard a guess that the difference is too small to worry about unless you expect a lot of nulls.
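The difference between '+' and '^' can be checked with a small sketch. PairHash and its method names are hypothetical and only illustrate the two combining operators side by side.

```csharp
using System;

// Hypothetical helper for illustration; not part of the original struct.
public static class PairHash
{
    // Null-safe hash of one string, treating null as the empty string.
    private static int H(string s)
    {
        return (s ?? String.Empty).GetHashCode();
    }

    // Order-independent, because integer addition is commutative.
    public static int Sum(string a, string b)
    {
        unchecked { return H(a) + H(b); }
    }

    // Also order-independent, but any pair of equal strings hashes to 0.
    public static int Xor(string a, string b)
    {
        return H(a) ^ H(b);
    }
}
```

Both methods give the same result for ("AA", "BB") and ("BB", "AA"), but PairHash.Xor("AA", "AA") and PairHash.Xor("BB", "BB") are both 0, so every instance with two identical strings would land in the same bucket.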
A simple general way is to do this:
return string.Format("{0}/{1}", str1, str2).GetHashCode();
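Unless you have strict performance requirements, this is the easiest I can think of, and I frequently use this method when I need a composite key. It handles the null cases just fine and won't cause (m)any hash collisions (in general). If you expect '/' in your strings, just choose another separator that you don't expect. Note that the result depends on the order of the arguments, so if ('AA', 'BB') and ('BB', 'AA') must hash alike, put the two strings into a fixed order before formatting.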
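If the composite key has to ignore the order of the two strings, one option is to sort them before formatting. The following is a minimal sketch under that assumption; CompositeKey and Hash are hypothetical names, and ordinal ordering is just one possible choice.

```csharp
using System;

// Hypothetical helper for illustration; not part of the original answer.
public static class CompositeKey
{
    // Builds the key with the two strings in ordinal order, so swapping
    // the arguments yields the same key and therefore the same hash code.
    public static int Hash(string a, string b)
    {
        // String.CompareOrdinal treats null as smaller than any non-null string.
        if (String.CompareOrdinal(a, b) > 0)
        {
            string tmp = a;
            a = b;
            b = tmp;
        }
        // string.Format renders a null argument as an empty string.
        return string.Format("{0}/{1}", a, b).GetHashCode();
    }
}
```

With this ordering, CompositeKey.Hash("AA", "BB") and CompositeKey.Hash("BB", "AA") are equal. Because null is formatted as an empty string, a null and an empty string produce the same key, which is a collision rather than a correctness problem.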
Too complicated, and forgets nulls, etc. This is used for things like bucketing, so you can get away with something like:
if (null != str1) {
    return str1.GetHashCode();
}
if (null != str2) {
    return str2.GetHashCode();
}
//Not sure what you would put here, some constant value will do
return 0;
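This is biased: it assumes that no single str1 value is shared by an unusually large proportion of instances. It also depends on which field holds which string, so ('AA', 'BB') and ('BB', 'AA') will generally get different hash codes.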
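Whichever formula is chosen, values that compare equal must return the same hash code. Here is a minimal sketch of how an order-insensitive Equals could be paired with a matching GetHashCode, assuming the rule mentioned earlier that ('AA', 'BB') and ('BB', 'AA') count as the same. UserInfoSketch is a hypothetical variant of the struct, and the constructor is added only for convenience.

```csharp
using System;

// Hypothetical variant of the struct, for illustration only.
public struct UserInfoSketch : IEquatable<UserInfoSketch>
{
    public string str1 { get; set; }
    public string str2 { get; set; }

    public UserInfoSketch(string s1, string s2) : this()
    {
        str1 = s1;
        str2 = s2;
    }

    // Equal when the two strings match in either order.
    public bool Equals(UserInfoSketch other)
    {
        return (str1 == other.str1 && str2 == other.str2)
            || (str1 == other.str2 && str2 == other.str1);
    }

    public override bool Equals(object obj)
    {
        return obj is UserInfoSketch && Equals((UserInfoSketch)obj);
    }

    // Addition is commutative, so swapped fields give the same hash code.
    public override int GetHashCode()
    {
        unchecked
        {
            return (str1 ?? String.Empty).GetHashCode() +
                   (str2 ?? String.Empty).GetHashCode();
        }
    }
}
```

With this pairing, new UserInfoSketch("AA", "BB") and new UserInfoSketch("BB", "AA") are equal and share a hash code, so they behave as one key in a Dictionary or HashSet.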