What is the algorithm used by the memberwise equality test in .NET structs? I would like to know this so that I can use it as the basis for my own algorithm.
This is the implementation of ValueType.Equals from the Shared Source Common Language Infrastructure (version 2.0).
public override bool Equals (Object obj) {
    BCLDebug.Perf(false, "ValueType::Equals is not fast. "+
        this.GetType().FullName+" should override Equals(Object)");
    if (null==obj) {
        return false;
    }
    RuntimeType thisType = (RuntimeType)this.GetType();
    RuntimeType thatType = (RuntimeType)obj.GetType();

    if (thatType!=thisType) {
        return false;
    }

    Object thisObj = (Object)this;
    Object thisResult, thatResult;

    // if there are no GC references in this object we can avoid reflection
    // and do a fast memcmp
    if (CanCompareBits(this))
        return FastEqualsCheck(thisObj, obj);

    FieldInfo[] thisFields = thisType.GetFields(
        BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);

    for (int i=0; i<thisFields.Length; i++) {
        thisResult = ((RtFieldInfo)thisFields[i])
            .InternalGetValue(thisObj, false);
        thatResult = ((RtFieldInfo)thisFields[i])
            .InternalGetValue(obj, false);

        if (thisResult == null) {
            if (thatResult != null)
                return false;
        }
        else
        if (!thisResult.Equals(thatResult)) {
            return false;
        }
    }

    return true;
}
It's interesting to note that this is pretty much exactly the code that is shown in Reflector. That surprised me because I thought that the SSCLI was just a reference implementation, not the final library. Then again, I suppose there are only so many ways to implement this relatively simple algorithm.
The parts that I wanted to understand more are the calls to CanCompareBits and FastEqualsCheck. These are both implemented as native methods, but their code is also included in the SSCLI. As you can see from the implementations below, the CLI looks at the definition of the object's class (via its method table) to see whether it contains pointers to reference types and how the memory for the object is laid out. If there are no references and the object is contiguous, then the memory is compared directly using the C function memcmp.
// Return true if the valuetype does not contain pointer and is tightly packed
FCIMPL1(FC_BOOL_RET, ValueTypeHelper::CanCompareBits, Object* obj)
{
    WRAPPER_CONTRACT;
    STATIC_CONTRACT_SO_TOLERANT;

    _ASSERTE(obj != NULL);
    MethodTable* mt = obj->GetMethodTable();
    FC_RETURN_BOOL(!mt->ContainsPointers() && !mt->IsNotTightlyPacked());
}
FCIMPLEND

FCIMPL2(FC_BOOL_RET, ValueTypeHelper::FastEqualsCheck, Object* obj1,
    Object* obj2)
{
    WRAPPER_CONTRACT;
    STATIC_CONTRACT_SO_TOLERANT;

    _ASSERTE(obj1 != NULL);
    _ASSERTE(obj2 != NULL);
    _ASSERTE(!obj1->GetMethodTable()->ContainsPointers());
    _ASSERTE(obj1->GetSize() == obj2->GetSize());

    TypeHandle pTh = obj1->GetTypeHandle();

    FC_RETURN_BOOL(memcmp(obj1->GetData(),obj2->GetData(),pTh.GetSize()) == 0);
}
FCIMPLEND
If I wasn't quite so lazy, I might look into the implementation of ContainsPointers and IsNotTightlyPacked. However, I've found out what I wanted to know (and I am lazy), so that's a job for another day.
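If you want to base your own algorithm on this, the reflection fallback in the ValueType.Equals listing above can be approximated with public APIs only. The following is a minimal sketch, not the runtime's code: FieldwiseComparer, FieldwiseEquals and the Person struct are names I made up for illustration, and there is no memcmp fast path.

```csharp
using System;
using System.Reflection;

// Hypothetical sample struct, used only to exercise the helper below.
public struct Person
{
    public string Name;
    public int Age;
}

public static class FieldwiseComparer
{
    // Same flags as the SSCLI listing: every instance field, public or not.
    private const BindingFlags AllInstanceFields =
        BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

    // Mirrors the reflection path: same runtime type, then field by field.
    public static bool FieldwiseEquals(object left, object right)
    {
        if (left == null || right == null)
            return false;

        Type type = left.GetType();
        if (type != right.GetType())
            return false;

        foreach (FieldInfo field in type.GetFields(AllInstanceFields))
        {
            // object.Equals covers the "both null" and "one null" cases
            // that the original spells out by hand.
            if (!object.Equals(field.GetValue(left), field.GetValue(right)))
                return false;
        }

        return true;
    }
}
```

Calling FieldwiseComparer.FieldwiseEquals on two Person values with the same Name and Age should give true, and false as soon as one field differs. Like the original, it relies on each field's own Equals, so it is only as deep as those implementations are.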
public static bool CompareMembers<T>(this T source, T other, params Expression<Func<object>>[] propertiesToSkip)
{
    PropertyInfo[] sourceProperties = source.GetType().GetProperties();

    // Resolve each lambda to a member name. Value-typed members are boxed to
    // object, so their body is a Convert node (UnaryExpression) around the access.
    List<string> propertiesToSkipList = (from x in propertiesToSkip
                                         let a = x.Body as MemberExpression
                                         let b = x.Body as UnaryExpression
                                         select a == null ? ((MemberExpression)b.Operand).Member.Name : a.Member.Name).ToList();

    // Keep only the properties whose name is not in the skip list.
    List<PropertyInfo> lstProperties = (
        from property in sourceProperties
        where !propertiesToSkipList.Contains(property.Name)
        select property).ToList();

    // object.Equals copes with null property values on either side.
    return lstProperties.All(property => object.Equals(property.GetValue(source, null), property.GetValue(other, null)));
}
How to use:
bool test = myObj1.CompareMembers(myObj2,
    () => myObj1.Id,
    () => myObj1.Name);
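The skip list works because each lambda is passed as an expression tree, so the member name can be read from it without evaluating anything. Here is a small stand-alone sketch of just that step; MemberNames, NameOf and Customer are hypothetical names of mine, not part of the answer's code.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical sample class with one value-typed and one reference-typed property.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class MemberNames
{
    // Returns the name of the property or field accessed in the lambda body.
    public static string NameOf(Expression<Func<object>> accessor)
    {
        // A value-typed member (such as an int) is boxed to object, which
        // wraps the member access in a Convert node (a UnaryExpression).
        UnaryExpression unary = accessor.Body as UnaryExpression;
        Expression inner = unary != null ? unary.Operand : accessor.Body;

        MemberExpression member = inner as MemberExpression;
        if (member == null)
            throw new ArgumentException("Expected a property or field access.", "accessor");

        return member.Member.Name;
    }
}
```

With a local `var c = new Customer();`, both `MemberNames.NameOf(() => c.Id)` and `MemberNames.NameOf(() => c.Name)` resolve to the property name, even though only the first one goes through the Convert node.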
There is no default memberwise equality, but for the base value types (float, byte, decimal etc.), the language spec demands bitwise comparison. The JIT optimizer optimizes this to the proper assembly instructions, but technically this behavior is equal to the C memcmp function.
Some examples from the BCL:
DateTime simply compares its internal InternalTicks member field, which is a long;
PointF compares X and Y as in (left.X == right.X) && (left.Y == right.Y);
Decimal does not compare internal fields but falls back to InternalImpl, which means it's in the internal, unviewable part of .NET (but you can check the SSCLI);
Rectangle explicitly compares each field (x, y, width, height);
ModuleHandle uses its Equals override, and there are many more that do this;
SqlString and the other SqlXXX structs use their IComparable.Compare implementation;
Guid is the weirdest in this list: it has its own long, short-circuiting list of if-statements comparing each and every internal field (_a to _k: an int, two shorts and eight bytes) for inequality, returning false when unequal. If none are unequal, it returns true.
This list is rather arbitrary, but I hope it sheds some light on the issue: there's no default method available, and even the BCL uses a different approach for each struct, depending on its purpose. The bottom line seems to be that later additions more frequently call their Equals override or IComparable.Compare, but that merely moves the issue to another method.
You can use reflection to go through each field, but this is very slow. You can also create a single extension method or static helper that does a bitwise compare on the internal fields. Mark the struct with StructLayout(LayoutKind.Sequential), take the memory address and the size, and compare the memory blocks. This requires unsafe code, but it is quick and easy (and a bit dirty).
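A rough sketch of that bitwise helper could look like the following. The assumptions are mine: it uses the `unmanaged` generic constraint (C# 7.3 or later), it has to be compiled with unsafe code enabled, and BitwiseComparer, BitwiseEquals and Point3 are made-up names.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical sample struct with a sequential layout and no padding.
[StructLayout(LayoutKind.Sequential)]
public struct Point3
{
    public int X;
    public int Y;
    public int Z;
}

public static class BitwiseComparer
{
    // Compares the raw bytes of two values. The "unmanaged" constraint
    // rules out any type that contains object references.
    public static unsafe bool BitwiseEquals<T>(T left, T right) where T : unmanaged
    {
        // Both arguments are by-value copies living on the stack, so
        // their addresses can be taken without a fixed statement.
        byte* leftBytes = (byte*)&left;
        byte* rightBytes = (byte*)&right;

        for (int i = 0; i < sizeof(T); i++)
        {
            if (leftBytes[i] != rightBytes[i])
                return false;
        }

        return true;
    }
}
```

Be aware that this also compares any padding bytes between fields, and that it treats values such as +0.0 and -0.0 as different, so it only suits structs where bit identity is what you actually mean by equality.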
Update: rephrasing, added some actual examples, added new conclusion
The above was apparently a slight misunderstanding of the question, but I'll leave it there since I think it contains some value for future visitors regardless. Here's a more to-the-point answer:
Here's an implementation of a memberwise compare for objects and value types alike, that can go through all properties, fields and enumerable contents, recursively no matter how deep. It is not tested, probably contains some typos, but it compiles alright. See comments in code for more details:
public static bool MemberCompare(object left, object right)
{
    if (Object.ReferenceEquals(left, right))
        return true;

    if (left == null || right == null)
        return false;

    Type type = left.GetType();
    if (type != right.GetType())
        return false;

    if (left is ValueType)
    {
        // do a field comparison, or use the override if Equals is implemented:
        return left.Equals(right);
    }

    // check for override; asking for Equals(object) explicitly avoids an
    // AmbiguousMatchException when the type declares other Equals overloads:
    if (type != typeof(object)
        && type == type.GetMethod("Equals", new[] { typeof(object) }).DeclaringType)
    {
        // the Equals method is overridden, use it:
        return left.Equals(right);
    }

    // all Arrays, Lists, IEnumerable<> etc implement IEnumerable
    IEnumerable leftItems = left as IEnumerable;
    if (leftItems != null)
    {
        // no Reset() here: a fresh enumerator already sits before the first
        // item, and iterator-based enumerators throw on Reset()
        IEnumerator rightEnumerator = ((IEnumerable)right).GetEnumerator();
        foreach (object leftItem in leftItems)
        {
            // right has fewer items than left
            if (!rightEnumerator.MoveNext())
                return false;

            if (!MemberCompare(leftItem, rightEnumerator.Current))
                return false;
        }

        // right has more items than left
        if (rightEnumerator.MoveNext())
            return false;
    }
    else
    {
        // compare each property
        foreach (PropertyInfo info in type.GetProperties(
            BindingFlags.Public |
            BindingFlags.NonPublic |
            BindingFlags.Instance |
            BindingFlags.GetProperty))
        {
            // indexers need arguments, so they are skipped here
            if (info.GetIndexParameters().Length > 0)
                continue;

            if (!MemberCompare(info.GetValue(left, null), info.GetValue(right, null)))
                return false;
        }

        // compare each field
        foreach (FieldInfo info in type.GetFields(
            BindingFlags.GetField |
            BindingFlags.NonPublic |
            BindingFlags.Public |
            BindingFlags.Instance))
        {
            if (!MemberCompare(info.GetValue(left), info.GetValue(right)))
                return false;
        }
    }

    return true;
}
Update: fixed a few errors, added use of overridden Equals if and only if available
Update: object.Equals should not be considered an override, fixed.
This is more complex than meets the eye. The short answer would be:
public bool MyEquals(object obj1, object obj2)
{
    if(obj1==null || obj2==null)
        return obj1==obj2;
    else if(...)
        ...  // Your custom code here (e.g. handle primitives such as int,
             // which would otherwise recurse into their own value field)
    else if(obj1.GetType().IsValueType)
        return
            obj1.GetType()==obj2.GetType() &&
            !obj1.GetType().GetFields(ALL_FIELDS).Any(field =>
                !MyEquals(field.GetValue(obj1), field.GetValue(obj2)));
    else
        return object.Equals(obj1, obj2);
}

const BindingFlags ALL_FIELDS =
    BindingFlags.Instance |
    BindingFlags.Public |
    BindingFlags.NonPublic;
However there is much more to it than that. Here are the details:
If you declare a struct and don't override .Equals(), the .NET Framework will use one of two different strategies depending on whether your struct has only "simple" value types ("simple" is defined below):
If the struct contains only "simple" value types, a bitwise comparison is done, basically:
memcmp(&struct1, &struct2, sizeof(struct1)) == 0
If the struct contains references or non-"simple" value types, each declared field is compared as with object.Equals():
struct1.GetType()==struct2.GetType() &&
!struct1.GetType().GetFields(ALL_FIELDS).Any(field =>
!object.Equals(field.GetValue(struct1), field.GetValue(struct2)));
What qualifies as a "simple" type? From my tests it appears to be any basic scalar type (int, long, decimal, double, etc), plus any struct that doesn't have a .Equals override and contains only "simple" types (recursively).
This has some interesting ramifications. For example, in this code:
struct DoubleStruct
{
    public double value;
}

public void TestDouble()
{
    var test1 = new DoubleStruct { value = 1 / double.PositiveInfinity };
    var test2 = new DoubleStruct { value = 1 / double.NegativeInfinity };

    bool valueEqual = test1.value.Equals(test2.value);
    bool structEqual = test1.Equals(test2);

    MessageBox.Show("valueEqual=" + valueEqual + ", structEqual=" + structEqual);
}
you would expect valueEqual to always be identical to structEqual, no matter what was assigned to test1.value and test2.value. This is not the case!
The reason for this surprising result is that double.Equals() takes into account some of the intricacies of the IEEE 754 encoding, such as multiple NaN and zero representations, but a bitwise comparison does not. Because "double" is considered a simple type, structEqual is false whenever the bits differ, even when valueEqual is true.
The above example used alternate zero representations, but this can also occur with multiple NaN values:
...
var test1 = new DoubleStruct { value = CreateNaN(1) };
var test2 = new DoubleStruct { value = CreateNaN(2) };
...
public unsafe double CreateNaN(byte lowByte)
{
    double result = double.NaN;
    ((byte*)&result)[0] = lowByte;
    return result;
}
In most ordinary situations this won't make a difference, but it is something to be aware of.
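If you want to see the underlying difference without going through a struct at all, you can compare the raw bit patterns directly. This sketch only uses double.Equals and BitConverter.DoubleToInt64Bits; the DoubleBits class and its method names are mine, and it deliberately does not call ValueType.Equals, whose result may depend on the runtime you test on.

```csharp
using System;

public static class DoubleBits
{
    // Value equality as double.Equals defines it.
    public static bool ValueEquals(double a, double b)
    {
        return a.Equals(b);
    }

    // Equality of the raw 64-bit IEEE 754 encodings, which is what a
    // memcmp-style comparison effectively checks.
    public static bool BitsEqual(double a, double b)
    {
        return BitConverter.DoubleToInt64Bits(a) == BitConverter.DoubleToInt64Bits(b);
    }
}
```

For `1 / double.PositiveInfinity` and `1 / double.NegativeInfinity` (positive and negative zero), ValueEquals gives true while BitsEqual gives false, because the two values differ only in the sign bit.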
Here's my own attempt at this problem. It works, but I'm not convinced I've covered all the bases.
public class MemberwiseEqualityComparer : IEqualityComparer
{
public bool Equals(object x, object y)
{
// ----------------------------------------------------------------
// 1. If exactly one is null, return false.
// 2. If they are the same reference, then they must be equal by
// definition.
// 3. If the objects are both IEnumerable, return the result of
// comparing each item.
// 4. If the objects are equatable, return the result of comparing
// them.
// 5. If the objects are different types, return false.
// 6. Iterate over the public properties and compare them. If there
// is a pair that are not equal, return false.
// 7. Return true.
// ----------------------------------------------------------------
//
// 1. If exactly one is null, return false.
//
if (null == x ^ null == y) return false;
//
// 2. If they are the same reference, then they must be equal by
// definition.
//
if (object.ReferenceEquals(x, y)) return true;
//
// 3. If the objects are both IEnumerable, return the result of
// comparing each item.
// For collections, we want to compare the contents rather than
// the properties of the collection itself so we check if the
// classes are IEnumerable instances before we check to see that
// they are the same type.
//
if (x is IEnumerable && y is IEnumerable && false == x is string)
{
return contentsAreEqual((IEnumerable)x, (IEnumerable)y);
}
//
// 4. If the objects are equatable, return the result of comparing
// them.
// We are assuming that the type of X implements IEquatable<> of itself
// (see below) which is true for the numeric types and string.
// e.g.: public class TypeOfX : IEquatable<TypeOfX> { ... }
//
var xType = x.GetType();
var yType = y.GetType();
var equatableType = typeof(IEquatable<>).MakeGenericType(xType);
if (equatableType.IsAssignableFrom(xType)
&& xType.IsAssignableFrom(yType))
{
return equatablesAreEqual(equatableType, x, y);
}
//
// 5. If the objects are different types, return false.
//
if (xType != yType) return false;
//
// 6. Iterate over the public properties and compare them. If there
// is a pair that are not equal, return false.
//
if (false == propertiesAndFieldsAreEqual(x, y)) return false;
//
// 7. Return true.
//
return true;
}
public int GetHashCode(object obj)
{
return null != obj ? obj.GetHashCode() : 0;
}
private bool contentsAreEqual(IEnumerable enumX, IEnumerable enumY)
{
// Cast<object>() rather than OfType<object>(): OfType silently drops null items.
var enumOfObjX = enumX.Cast<object>();
var enumOfObjY = enumY.Cast<object>();
if (enumOfObjX.Count() != enumOfObjY.Count()) return false;
var contentsAreEqual = enumOfObjX
.Zip(enumOfObjY) // Custom Zip extension which returns
// Pair<TFirst,TSecond>. Similar to .NET 4's Zip
// extension.
.All(pair => Equals(pair.First, pair.Second))
;
return contentsAreEqual;
}
private bool equatablesAreEqual(Type equatableType, object x, object y)
{
var equalsMethod = equatableType.GetMethod("Equals");
var equal = (bool)equalsMethod.Invoke(x, new[] { y });
return equal;
}
private bool propertiesAndFieldsAreEqual(object x, object y)
{
const BindingFlags bindingFlags
= BindingFlags.Public | BindingFlags.Instance;
var propertyValues = from pi in x.GetType()
.GetProperties(bindingFlags)
.AsQueryable()
where pi.CanRead
select new
{
Name = pi.Name,
XValue = pi.GetValue(x, null),
YValue = pi.GetValue(y, null),
};
var fieldValues = from fi in x.GetType()
.GetFields(bindingFlags)
.AsQueryable()
select new
{
Name = fi.Name,
XValue = fi.GetValue(x),
YValue = fi.GetValue(y),
};
var propertiesAreEqual = propertyValues.Union(fieldValues)
.All(v => Equals(v.XValue, v.YValue))
;
return propertiesAreEqual;
}
}
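Step 4 is the least obvious part, so here it is in isolation: close IEquatable<> over the runtime type of x, check that the type implements it, and invoke its Equals via reflection. This is only a sketch of that one step; EquatableProbe and its two method names are hypothetical and not part of the comparer above.

```csharp
using System;

public static class EquatableProbe
{
    // True when "type" implements IEquatable<type>, i.e. IEquatable of itself.
    public static bool ImplementsSelfEquatable(Type type)
    {
        Type closed = typeof(IEquatable<>).MakeGenericType(type);
        return closed.IsAssignableFrom(type);
    }

    // Calls IEquatable<T>.Equals(T) on x, where T is the runtime type of x.
    // Assumes ImplementsSelfEquatable(x.GetType()) holds and y is assignable to T.
    public static bool EquatableEquals(object x, object y)
    {
        Type closed = typeof(IEquatable<>).MakeGenericType(x.GetType());

        // The interface declares exactly one method, so the lookup by name
        // is unambiguous.
        return (bool)closed.GetMethod("Equals").Invoke(x, new[] { y });
    }
}
```

For example, `EquatableProbe.ImplementsSelfEquatable(typeof(int))` and the same call with `typeof(string)` hold, while `typeof(object)` does not qualify, so plain objects fall through to the later steps. The reflective Invoke is slow, so for hot paths you would probably want to cache the MethodInfo per type.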