Without doing anything special for a reference type, Equals()
would mean reference equality (i.e. same object). If I choose to override Equals()
f
You could provide multiple IEqualityComparer<T> implementations and let the consumer decide.
Example:
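// Leave the class Equals as reference equality
class Person
{
    // Public, settable properties so the object initializers in the usage below compile
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Address { get; set; }
    // ...
}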
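// Two persons are equal when they share the same Id
class PersonIdentityEqualityComparer : IEqualityComparer<Person>
{
    public bool Equals(Person p1, Person p2)
    {
        if (ReferenceEquals(p1, p2)) return true; // same instance, or both null
        if (p1 == null || p2 == null) return false;
        return p1.Id == p2.Id;
    }
    public int GetHashCode(Person p)
    {
        return p.Id.GetHashCode();
    }
}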
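// Two persons are equal when all of their values match
class PersonValueEqualityComparer : IEqualityComparer<Person>
{
    public bool Equals(Person p1, Person p2)
    {
        if (ReferenceEquals(p1, p2)) return true; // same instance, or both null
        if (p1 == null || p2 == null) return false;
        return p1.Id == p2.Id &&
               p1.FirstName == p2.FirstName; // etc
    }
    public int GetHashCode(Person p)
    {
        unchecked // let the arithmetic wrap instead of throwing in a checked context
        {
            int hash = 17;
            hash = hash * 23 + p.Id.GetHashCode();
            hash = hash * 23 + (p.FirstName == null ? 0 : p.FirstName.GetHashCode());
            // etc
            return hash;
        }
    }
}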
See also: What is the best algorithm for an overridden System.Object.GetHashCode?
Usage:
var personIdentityComparer = new PersonIdentityEqualityComparer();
var personValueComparer = new PersonValueEqualityComparer();
var joseph = new Person { Id = 1, FirstName = "Joseph" };
var persons = new List<Person>
{
    new Person { Id = 1, FirstName = "Joe" },
    new Person { Id = 2, FirstName = "Mary" },
    joseph
};
// Identity set keeps 2 entries: joseph is dropped as a duplicate of "Joe" (same Id)
var personsIdentity = new HashSet<Person>(persons, personIdentityComparer);
// Value set keeps all 3 entries: "Joe" and "Joseph" differ by FirstName
var personsValue = new HashSet<Person>(persons, personValueComparer);
var containsJoseph = personsIdentity.Contains(joseph);
Console.WriteLine(containsJoseph); // True: "Joe" has the same Id, so the identity comparer matches
containsJoseph = personsValue.Contains(joseph);
Console.WriteLine(containsJoseph); // True: joseph himself is in the set
Fundamentally, if class-type fields (or variables, array slots, etc.) X and Y each hold a reference to a class object, there are two logical questions that ((Object)X).Equals(Y) can answer:
1. If the reference in Y were copied to X, would the class have any reason to expect that change to affect the present or future behavior of the program?
2. If all references to the target of X were made to point to the target of Y, and vice versa, would the class have any reason to expect that swap to alter program behavior?
Note that if X and Y refer to objects of different types, neither function may legitimately return true unless both classes know that there cannot be any storage locations holding a reference to one which could not also hold a reference to the other [e.g. because both types are private classes derived from a common base, and neither is ever stored in any storage location (other than this) whose type can't hold references to both].
The default Object.Equals method answers the first question; ValueType.Equals answers the second. The first question is generally the appropriate one to ask of object instances whose observable state may be mutated; the second is appropriate to ask of object instances whose observable state will not be mutated even if their types would allow it. If X and Y each hold a reference to a distinct int[1], and both arrays hold 23 in their first element, the first equality relation should define them as distinct [copying X to Y would alter the behavior of X[0] if Y[0] were modified], but the second should regard them as equivalent (swapping all references to the targets of X and Y wouldn't affect anything). Note that if the arrays held different values, the second test should regard the arrays as distinct, since swapping the objects would mean X[0] would now report the value that Y[0] used to report.
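The int[1] case can be checked directly. What follows is only a minimal sketch using the standard library; the names ArrayEqualityDemo, SameObject and SameState are made up for illustration and are not part of any API.

```csharp
using System;
using System.Linq;

// Hypothetical helper, for illustration only.
static class ArrayEqualityDemo
{
    // First question: do both references identify the same object?
    // Arrays do not override Object.Equals, so this is reference equality.
    public static bool SameObject(int[] x, int[] y) => x.Equals(y);

    // Second question: do the two arrays currently hold equivalent state?
    // Enumerable.SequenceEqual compares them element by element.
    public static bool SameState(int[] x, int[] y) => x.SequenceEqual(y);
}
```

Given var x = new[] { 23 }; and var y = new[] { 23 };, SameObject(x, y) is false while SameState(x, y) is true. After y[0] = 24, both are false.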
There's a pretty strong convention that mutable types (other than System.ValueType and its descendants) should override Object.Equals to implement the first type of equivalence relation; since it's impossible for System.ValueType or its descendants to implement the first relation, they generally implement the second. Unfortunately, there's no standard convention by which objects which override Object.Equals() for the first kind of relation should expose a method which tests for the second, even though an equivalence relation could be defined which allowed comparison between any two objects of any arbitrary type.

The second relation would be useful in the standard pattern wherein an immutable class Imm holds a private reference to a mutable type Mut but doesn't expose that object to any code that could actually mutate it [making the instance immutable]. In such a case, there's no way for class Mut to know that an instance will never be written, but it would be helpful to have a standard means by which two instances of Imm could ask the Mut instances to which they hold references whether they would be equivalent if the holders of the references never mutated them. Note that the equivalence relation defined above makes no reference to mutation, nor to any particular means which Imm must use to ensure that an instance won't be mutated, but its meaning is well-defined in any case. The object which holds a reference to Mut should know whether that reference encapsulates identity, mutable state, or immutable state, and should thus be able to implement its own equality relation suitably.
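As a rough sketch of that pattern, assume the mutable type is a plain int[] and the holder is a class named Imm, as in the text. The members shown here are illustrative assumptions, not a standard API. Because Imm knows its private array is never written, it can answer the second question on the array's behalf:

```csharp
using System;
using System.Linq;

// Hypothetical immutable holder: it owns a private int[] (the "Mut") that it
// never exposes or modifies, so the array's contents act as immutable state.
sealed class Imm : IEquatable<Imm>
{
    private readonly int[] _items;

    public Imm(params int[] items)
    {
        // Defensive copy: no outside code keeps a reference to the private array.
        _items = (int[])items.Clone();
    }

    public int Count => _items.Length;
    public int this[int index] => _items[index];

    // The array itself only knows reference equality; the holder knows the
    // contents will never change, so it compares state instead.
    public bool Equals(Imm other) =>
        !(other is null) && _items.SequenceEqual(other._items);

    public override bool Equals(object obj) => Equals(obj as Imm);

    public override int GetHashCode()
    {
        unchecked // wrap on overflow rather than throw
        {
            int hash = 17;
            foreach (int item in _items) hash = hash * 23 + item;
            return hash;
        }
    }
}
```

With this in place, new Imm(1, 2, 3).Equals(new Imm(1, 2, 3)) is true even though the two instances wrap distinct arrays.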
Yes, deciding the right rules for this is tricky. There is no single "right" answer here, and it will depend a lot on both context and preference. Personally, I rarely bother thinking about it much, just defaulting to reference equality on most regular POCO classes:
- the need to use something like Person as a dictionary-key / in a hash-set is minimal
- I would usually use something like int Id as the key in a dictionary (etc) anyway
- with reference equality, x==y gives the same result whether x/y are Person or object, or indeed T in a generic method
- as long as Equals and GetHashCode are compatible, most things will just about work out, and one easy way to do that is to not override them

Note, however, that I would always advise the opposite for value-types, i.e. explicitly override Equals / GetHashCode; but then, writing a struct is really uncommon.
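For the value-type case, an explicit override might look like the sketch below. Point is a hypothetical struct invented for this example, and the hash follows the same 17/23 pattern used earlier in the thread.

```csharp
using System;

// Hypothetical value type, used only to illustrate the override pattern.
readonly struct Point : IEquatable<Point>
{
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y) { X = x; Y = y; }

    // Strongly typed Equals avoids boxing and the reflection-based fallback
    // that the default ValueType.Equals may use.
    public bool Equals(Point other) => X == other.X && Y == other.Y;

    public override bool Equals(object obj) => obj is Point other && Equals(other);

    // Must agree with Equals: equal points produce equal hash codes.
    public override int GetHashCode()
    {
        unchecked // wrap on overflow rather than throw
        {
            int hash = 17;
            hash = hash * 23 + X;
            hash = hash * 23 + Y;
            return hash;
        }
    }

    public static bool operator ==(Point left, Point right) => left.Equals(right);
    public static bool operator !=(Point left, Point right) => !left.Equals(right);
}
```

Implementing IEquatable<Point> also lets generic collections such as HashSet<Point> and Dictionary<Point, T> call the typed Equals directly through the default comparer.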