Question
Consider the following code:
struct Vec2 : IEquatable<Vec2>
{
    // public so the object initializers in the example below compile
    public double X, Y;

    public bool Equals(Vec2 other)
    {
        return X.Equals(other.X) && Y.Equals(other.Y);
    }

    public override bool Equals(object obj)
    {
        if (obj is Vec2)
        {
            return Equals((Vec2)obj);
        }
        return false;
    }

    // this will return the same value when X, Y are swapped
    public override int GetHashCode()
    {
        return X.GetHashCode() ^ Y.GetHashCode();
    }
}
Beyond the conversation of comparing doubles for equality (this is just demo code), what I am concerned with is that there is a hash clash when X, Y values are swapped. For example:
Vec2 A = new Vec2() { X=1, Y=5 };
Vec2 B = new Vec2() { X=5, Y=1 };
bool test1 = A.Equals(B); // returns false
bool test2 = A.GetHashCode() == B.GetHashCode(); // returns true !!!!!
which should wreak havoc in a dictionary collection. So the question is how to properly form the GetHashCode() function for 2, 3 or even 4 floating point values such that the results are not symmetric and the hashes don't clash.
Edit 1:
Point implements the inappropriate x ^ y solution, and PointF wraps ValueType.GetHashCode().
Rectangle has a very peculiar (((X ^ ((Y << 13) | (Y >> 19))) ^ ((Width << 26) | (Width >> 6))) ^ ((Height << 7) | (Height >> 25))) expression for the hash code, which seems to perform as expected.
Edit 2:
System.Double has a nice implementation, as it does not consider each bit equally important:
public override unsafe int GetHashCode() // from System.Double
{
    double num = this;
    if (num == 0.0)
    {
        return 0;
    }
    long num2 = *((long*) &num);
    return (((int) num2) ^ ((int) (num2 >> 32)));
}
Answer 1:
Jon Skeet has this covered:
What is the best algorithm for an overridden System.Object.GetHashCode?
public override int GetHashCode()
{
    unchecked // Overflow is fine, just wrap
    {
        int hash = 17;
        // Suitable nullity checks etc, of course :)
        hash = hash * 23 + X.GetHashCode();
        hash = hash * 23 + Y.GetHashCode();
        return hash;
    }
}
Also, if the type were a class, you could change your Equals(object) implementation to:
return Equals(obj as FVector2);
Note however that this could perceive a derived type to be equal. If you don't want that, you'd have to compare the runtime type other.GetType() with typeof(FVector2) (and don't forget nullity checks). Since the type here is a struct, though, the as operator cannot be applied to it and the original is check should stay. Thanks for pointing out it's a struct, LukH.
ReSharper has nice code generation for equality and hash code, so if you have ReSharper you can let it do its thing.
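As a rough check of the multiply-and-add approach above against the swapped-coordinate case from the question, here is a minimal sketch. The struct repeats the question's Vec2 with the new GetHashCode; the SwapCheck helper is a hypothetical name introduced only for illustration.

```csharp
using System;

struct Vec2 : IEquatable<Vec2>
{
    public double X, Y;

    public bool Equals(Vec2 other)
    {
        return X.Equals(other.X) && Y.Equals(other.Y);
    }

    public override bool Equals(object obj)
    {
        return obj is Vec2 && Equals((Vec2)obj);
    }

    public override int GetHashCode()
    {
        unchecked // overflow simply wraps around
        {
            int hash = 17;
            // The multiplication makes the result depend on field order.
            hash = hash * 23 + X.GetHashCode();
            hash = hash * 23 + Y.GetHashCode();
            return hash;
        }
    }
}

// Hypothetical helper, not part of the original answer.
static class SwapCheck
{
    // True when (x, y) and (y, x) produce different hash codes.
    public static bool SwappedHashesDiffer(double x, double y)
    {
        var a = new Vec2 { X = x, Y = y };
        var b = new Vec2 { X = y, Y = x };
        return a.GetHashCode() != b.GetHashCode();
    }
}
```

For the question's values, SwapCheck.SwappedHashesDiffer(1, 5) should be true, although other unequal pairs can still collide. On newer runtimes (.NET Core 2.1 and later), System.HashCode.Combine(X, Y) is a built-in, order-sensitive alternative.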
Answer 2:
Hash collisions don't wreak havoc in a dictionary collection. They'll reduce the efficiency if you're unlucky enough to get them, but dictionaries have to cope with them.
Collisions should be rare if at all possible, but they don't mean the implementation is incorrect. XORs are often bad for the reasons you've given (high collisions) - ohadsc has posted a sample I gave before for an alternative, which should be fine.
Note that it would be impossible to implement Vec2 with no collisions - there are only 2^32 possible return values from GetHashCode, but there are rather more possible X and Y values, even after you've removed NaN and infinite values...
Eric Lippert has a recent blog post on GetHashCode which you may find useful.
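To illustrate the point that collisions cost efficiency rather than correctness, here is a small sketch that deliberately keeps the question's XOR hash. XorVec2 and CollisionDemo are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;

// Same shape as the question's Vec2, keeping the symmetric XOR hash on purpose.
struct XorVec2 : IEquatable<XorVec2>
{
    public double X, Y;

    public bool Equals(XorVec2 other)
    {
        return X.Equals(other.X) && Y.Equals(other.Y);
    }

    public override bool Equals(object obj)
    {
        return obj is XorVec2 && Equals((XorVec2)obj);
    }

    public override int GetHashCode()
    {
        return X.GetHashCode() ^ Y.GetHashCode();
    }
}

static class CollisionDemo
{
    // Stores two keys with identical hash codes and reads both back.
    public static string[] Lookup()
    {
        var a = new XorVec2 { X = 1, Y = 5 };
        var b = new XorVec2 { X = 5, Y = 1 };
        var map = new Dictionary<XorVec2, string>();
        map[a] = "a";
        map[b] = "b"; // same hash as a, but Equals tells them apart
        return new[] { map[a], map[b] };
    }
}
```

Both entries survive and are retrieved correctly; the dictionary just has to fall back on Equals to separate keys whose hash codes match.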
Answer 3:
What are reasonable bounds for the coordinates?
Unless it can be all possible integer values you could simply:
const int SOME_LARGE_NUMBER = 100000;
return SOME_LARGE_NUMBER * x + y;
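A minimal sketch of this packing idea for integer coordinates follows. The bounds are assumptions made for illustration: the result is collision-free only while 0 <= y < Width and x is small enough that the product fits in an int. GridHash, Width and Pack are hypothetical names.

```csharp
using System;

static class GridHash
{
    // Assumed bound: every y satisfies 0 <= y < Width.
    const int Width = 100000;

    // Packs (x, y) into a single int; unique only within the assumed bounds.
    // checked turns a silent overflow into an OverflowException.
    public static int Pack(int x, int y)
    {
        return checked(Width * x + y);
    }
}
```

With these assumptions GridHash.Pack(1, 5) and GridHash.Pack(5, 1) differ, because each coordinate occupies its own range of the result.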
Answer 4:
If the size of your hash code is smaller than the size of your struct, then clashes are inevitable anyway.
Answer 5:
The hash code approach works for integer coordinates but is not recommended for floating-point values. With floating-point coordinates one can create a point set/pool by using a sorted sequence structure.
A sorted sequence is a leaf version of a balanced binary tree.
Here the keys would be the point coordinates.
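One way to sketch an ordered point pool in C# is with SortedSet<T> and a lexicographic comparer. Note that this swaps in the framework's own balanced tree rather than the leaf-based structure described above, and Point2, LexicographicComparer and PointPool are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

struct Point2
{
    public double X, Y;
}

// Orders points by X first, then by Y.
sealed class LexicographicComparer : IComparer<Point2>
{
    public int Compare(Point2 a, Point2 b)
    {
        int byX = a.X.CompareTo(b.X);
        return byX != 0 ? byX : a.Y.CompareTo(b.Y);
    }
}

static class PointPool
{
    // Creates an empty pool keyed by point coordinates; no hash code is involved.
    public static SortedSet<Point2> Create()
    {
        return new SortedSet<Point2>(new LexicographicComparer());
    }
}
```

Adding (1, 5) and (5, 1) yields two distinct entries, while adding (1, 5) a second time is rejected, at the cost of logarithmic rather than constant-time lookups.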
Source: https://stackoverflow.com/questions/5221396/what-is-an-appropriate-gethashcode-algorithm-for-a-2d-point-struct-avoiding