What is the algorithm used by the memberwise equality test in .NET structs?

Posted by 人走茶凉 on 2020-01-22 10:54:08

Question


What is the algorithm used by the memberwise equality test in .NET structs? I would like to know this so that I can use it as the basis for my own algorithm.

I am trying to write a recursive memberwise equality test for arbitrary objects (in C#) for testing the logical equality of DTOs. This is considerably easier if the DTOs are structs (since ValueType.Equals does mostly the right thing) but that is not always appropriate. I would also like to override comparison of any IEnumerable objects (but not strings!) so that their contents are compared rather than their properties.

This has proven to be harder than I would expect. Any hints will be greatly appreciated. I'll accept the answer that proves most useful or supplies a link to the most useful information.

Thanks.


Answer 1:


There is no default memberwise equality, but for the base value types (float, byte, decimal etc), the language spec demands bitwise comparison. The JIT optimizer optimizes this to the proper assembly instructions, but technically this behavior is equal to the C memcmp function.

Some BCL examples

  • DateTime simply compares its internal InternalTicks member field, which is a long;
  • PointF compares X and Y as in (left.X == right.X) && (left.Y == right.Y);
  • Decimal does not compare internal fields but falls back to InternalImpl, which means, it's in the internal unviewable .NET part (but you can check SSCLI);
  • Rectangle explicitly compares each field (x, y, width, height);
  • ModuleHandle uses its Equals override and there are many more that do this;
  • SqlString and the other SqlXXX structs use their IComparable.Compare implementation;
  • Guid is the weirdest in this list: it has its own long, short-circuiting list of if-statements comparing each and every internal field (_a to _k: one int, two shorts and eight bytes) for inequality, returning false as soon as one differs. If none differ, it returns true.

Conclusion

This list is rather arbitrary, but I hope it sheds some light on the issue: there's no default method available, and even the BCL uses a different approach for each struct, depending on its purpose. The bottom line seems to be that later additions more frequently call their Equals override or IComparable.Compare, but that merely moves the issue to another method.
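
As an illustration of the explicit field-by-field style used by PointF and Rectangle, here is a minimal sketch of a struct that supplies its own equality. The type name GridPoint and its fields are hypothetical and not taken from the BCL; the hash formula is just one common choice.

```csharp
using System;

// Hypothetical example type; not part of the BCL.
public struct GridPoint : IEquatable<GridPoint>
{
    public int X;
    public int Y;

    public GridPoint(int x, int y)
    {
        X = x;
        Y = y;
    }

    // Strongly typed comparison: no boxing and no reflection.
    public bool Equals(GridPoint other)
    {
        return X == other.X && Y == other.Y;
    }

    // Replaces the reflection-based ValueType.Equals fallback.
    public override bool Equals(object obj)
    {
        return obj is GridPoint && Equals((GridPoint)obj);
    }

    // Must agree with Equals: equal values produce equal hash codes.
    public override int GetHashCode()
    {
        unchecked { return (X * 397) ^ Y; }
    }

    public static bool operator ==(GridPoint left, GridPoint right)
    {
        return left.Equals(right);
    }

    public static bool operator !=(GridPoint left, GridPoint right)
    {
        return !left.Equals(right);
    }
}
```

Implementing IEquatable<T> alongside the Equals(object) override lets generic collections compare values without boxing.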

Other ways:

You can use reflection to go through each field, but this is very slow. You can also create a single extension method or static helper that does a bitwise compare on the internal fields. Mark the struct with [StructLayout(LayoutKind.Sequential)], take the memory address and the size, and compare the memory blocks. This requires unsafe code, but it is quick, easy (and a bit dirty).
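
A minimal sketch of such a bitwise helper follows. Instead of raw pointers it uses spans, which is a swapped-in technique that avoids the unsafe keyword; it assumes .NET Core 2.1 or later and C# 7.3 (for the unmanaged constraint). The names BitwiseComparer, BitwiseEquals and Point3 are hypothetical. Be aware that padding bytes inside a struct are compared too, so this is only dependable for tightly packed structs.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical sample struct with no padding between its fields.
[StructLayout(LayoutKind.Sequential)]
public struct Point3
{
    public int X;
    public int Y;
    public int Z;
}

public static class BitwiseComparer
{
    // Compares the raw bytes of two values of the same unmanaged struct type.
    // The 'unmanaged' constraint rejects types that contain object references.
    public static bool BitwiseEquals<T>(T left, T right) where T : unmanaged
    {
        // View each local copy as a one-element span, then reinterpret it as bytes.
        ReadOnlySpan<byte> leftBytes =
            MemoryMarshal.AsBytes(MemoryMarshal.CreateReadOnlySpan(ref left, 1));
        ReadOnlySpan<byte> rightBytes =
            MemoryMarshal.AsBytes(MemoryMarshal.CreateReadOnlySpan(ref right, 1));

        // Byte-by-byte comparison, comparable in spirit to memcmp.
        return leftBytes.SequenceEqual(rightBytes);
    }
}
```

A call such as BitwiseComparer.BitwiseEquals(a, b) then compares two Point3 values without reflection.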

Update: rephrasing, added some actual examples, added new conclusion


Update: implementation of memberwise compare

The above was apparently a slight misunderstanding of the question, but I'll leave it there since I think it contains some value for future visitors regardless. Here's a more to-the-point answer:

Here's an implementation of a memberwise compare for objects and value types alike, that can go through all properties, fields and enumerable contents, recursively no matter how deep. It is not tested, probably contains some typos, but it compiles alright. See comments in code for more details:

public static bool MemberCompare(object left, object right)
{
    if (Object.ReferenceEquals(left, right))
        return true;

    if (left == null || right == null)
        return false;

    Type type = left.GetType();
    if (type != right.GetType())
        return false;

    if (left is ValueType)
    {
        // do a field comparison, or use the override if Equals is implemented:
        return left.Equals(right);
    }

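    // check for override; asking for Equals(object) explicitly avoids an
    // AmbiguousMatchException on types that also declare Equals(T):
    if (type.GetMethod("Equals", new[] { typeof(object) }).DeclaringType
        != typeof(object))
    {
        // the Equals method is overridden (here or in a base class), use it:
        return left.Equals(right);
    }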

    // all Arrays, Lists, IEnumerable<> etc implement IEnumerable
    if (left is IEnumerable)
    {
        // no Reset() call here: a fresh enumerator already starts before the
        // first item, and iterator-based enumerators throw on Reset()
        IEnumerator rightEnumerator = ((IEnumerable)right).GetEnumerator();
        foreach (object leftItem in (IEnumerable)left)
        {
            // right has fewer items than left
            if (!rightEnumerator.MoveNext())
                return false;

            if (!MemberCompare(leftItem, rightEnumerator.Current))
                return false;
        }

        // right has more items than left
        if (rightEnumerator.MoveNext())
            return false;
    }
    else
    {
        // compare each readable, non-indexed property
        foreach (PropertyInfo info in type.GetProperties(
            BindingFlags.Public | 
            BindingFlags.NonPublic | 
            BindingFlags.Instance))
        {
            // indexers need arguments and cannot be read generically, so skip them
            if (!info.CanRead || info.GetIndexParameters().Length > 0)
                continue;

            if (!MemberCompare(info.GetValue(left, null), info.GetValue(right, null)))
                return false;
        }

        // compare each field
        foreach (FieldInfo info in type.GetFields(
            BindingFlags.GetField |
            BindingFlags.NonPublic |
            BindingFlags.Public |
            BindingFlags.Instance))
        {
            if (!MemberCompare(info.GetValue(left), info.GetValue(right)))
                return false;
        }
    }
    return true;
}

Update: fixed a few errors, added use of overridden Equals if and only if available
Update: object.Equals should not be considered an override, fixed.




Answer 2:


This is the implementation of ValueType.Equals from the Shared Source Common Language Infrastructure (version 2.0).

public override bool Equals (Object obj) {
    BCLDebug.Perf(false, "ValueType::Equals is not fast.  "+
        this.GetType().FullName+" should override Equals(Object)");
    if (null==obj) {
        return false;
    }
    RuntimeType thisType = (RuntimeType)this.GetType();
    RuntimeType thatType = (RuntimeType)obj.GetType();

    if (thatType!=thisType) {
        return false;
    }

    Object thisObj = (Object)this;
    Object thisResult, thatResult;

    // if there are no GC references in this object we can avoid reflection 
    // and do a fast memcmp
    if (CanCompareBits(this))
        return FastEqualsCheck(thisObj, obj);

    FieldInfo[] thisFields = thisType.GetFields(
        BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);

    for (int i=0; i<thisFields.Length; i++) {
        thisResult = ((RtFieldInfo)thisFields[i])
            .InternalGetValue(thisObj, false);
        thatResult = ((RtFieldInfo)thisFields[i])
            .InternalGetValue(obj, false);

        if (thisResult == null) {
            if (thatResult != null)
                return false;
        }
        else
        if (!thisResult.Equals(thatResult)) {
            return false;
        }
    }

    return true;
}

It's interesting to note that this is pretty much exactly the code that is shown in Reflector. That surprised me because I thought that the SSCLI was just a reference implementation, not the final library. Then again, I suppose there is a limited number of ways to implement this relatively simple algorithm.

The parts that I wanted to understand more are the calls to CanCompareBits and FastEqualsCheck. These are both implemented as native methods but their code is also included in the SSCLI. As you can see from the implementations below, the CLI looks at the definition of the object's class (via its method table) to see if it contains pointers to reference types and how the memory for the object is laid out. If there are no references and the object is contiguous, then the memory is compared directly using the C function memcmp.

// Return true if the valuetype does not contain pointer and is tightly packed
FCIMPL1(FC_BOOL_RET, ValueTypeHelper::CanCompareBits, Object* obj)
{
    WRAPPER_CONTRACT;
    STATIC_CONTRACT_SO_TOLERANT;

    _ASSERTE(obj != NULL);
    MethodTable* mt = obj->GetMethodTable();
    FC_RETURN_BOOL(!mt->ContainsPointers() && !mt->IsNotTightlyPacked());
}
FCIMPLEND

FCIMPL2(FC_BOOL_RET, ValueTypeHelper::FastEqualsCheck, Object* obj1,
    Object* obj2)
{
    WRAPPER_CONTRACT;
    STATIC_CONTRACT_SO_TOLERANT;

    _ASSERTE(obj1 != NULL);
    _ASSERTE(obj2 != NULL);
    _ASSERTE(!obj1->GetMethodTable()->ContainsPointers());
    _ASSERTE(obj1->GetSize() == obj2->GetSize());

    TypeHandle pTh = obj1->GetTypeHandle();

    FC_RETURN_BOOL(memcmp(obj1->GetData(),obj2->GetData(),pTh.GetSize()) == 0);
}
FCIMPLEND

If I weren't quite so lazy, I might look into the implementation of ContainsPointers and IsNotTightlyPacked. However, I've definitively found out what I wanted to know (and I am lazy) so that's a job for another day.
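
For a rough feel of what the ContainsPointers half of that check decides, here is a managed-code approximation using reflection. It is only a sketch under my own assumptions, not the runtime's implementation: the names LayoutInspector and ContainsReferences are hypothetical, and it does not attempt the IsNotTightlyPacked (padding) half at all.

```csharp
using System;
using System.Reflection;

public static class LayoutInspector
{
    private const BindingFlags InstanceFields =
        BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

    // Returns true if the type is a reference type, or a value type that
    // holds a reference-type field directly or inside a nested struct.
    public static bool ContainsReferences(Type type)
    {
        if (!type.IsValueType)
            return true;

        // Primitives and enums hold plain bits only. Stopping here also
        // prevents endless recursion, since Int32 reports an Int32 field.
        if (type.IsPrimitive || type.IsEnum)
            return false;

        foreach (FieldInfo field in type.GetFields(InstanceFields))
        {
            Type fieldType = field.FieldType;

            // Unmanaged pointers (int* and friends) are not GC references.
            if (fieldType.IsPointer)
                continue;

            if (ContainsReferences(fieldType))
                return true;
        }

        return false;
    }
}
```

Under this sketch a struct of plain numeric fields reports no references, while any struct with a string or object field does, which is the case where ValueType.Equals falls back to the reflection loop shown above.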

  • Shared Source Common Language Infrastructure 2.0



Answer 3:


This is more complex than meets the eye. The short answer would be:

public bool MyEquals(object obj1, object obj2)
{
  if(obj1==null || obj2==null)
    return obj1==obj2;
  else if(...)
    ...  // Your custom code here
  else if(obj1.GetType().IsValueType)
    return
      obj1.GetType()==obj2.GetType() &&
      !obj1.GetType().GetFields(ALL_FIELDS).Any(field =>
       !MyEquals(field.GetValue(obj1), field.GetValue(obj2)));
  else
    return object.Equals(obj1, obj2);
}

const BindingFlags ALL_FIELDS =
  BindingFlags.Instance |
  BindingFlags.Public |
  BindingFlags.NonPublic;

However there is much more to it than that. Here are the details:

If you declare a struct and don't override .Equals(), the .NET Framework will use one of two different strategies depending on whether your struct contains only "simple" value types ("simple" is defined below):

If the struct contains only "simple" value types, a bitwise comparison is done, basically:

memcmp((byte*)&struct1, (byte*)&struct2, Marshal.SizeOf(struct1)) == 0;

If the struct contains references or non-"simple" value types, each declared field is compared as with object.Equals():

struct1.GetType()==struct2.GetType() &&
!struct1.GetType().GetFields(ALL_FIELDS).Any(field =>
  !object.Equals(field.GetValue(struct1), field.GetValue(struct2)));

What qualifies as a "simple" type? From my tests it appears to be any basic scalar type (int, long, decimal, double, etc), plus any struct that doesn't have a .Equals override and contains only "simple" types (recursively).

This has some interesting ramifications. For example, in this code:

struct DoubleStruct
{
  public double value;
}

public void TestDouble()
{
  var test1 = new DoubleStruct { value = 1 / double.PositiveInfinity };
  var test2 = new DoubleStruct { value = 1 / double.NegativeInfinity };

  bool valueEqual = test1.value.Equals(test2.value);
  bool structEqual = test1.Equals(test2);

  MessageBox.Show("valueEqual=" + valueEqual + ", structEqual=" + structEqual);
}

you would expect valueEqual to always be identical to structEqual, no matter what was assigned to test1.value and test2.value. This is not the case!

The reason for this surprising result is that double.Equals() takes into account some of the intricacies of the IEEE 754 encoding such as multiple NaN and zero representations, but a bitwise comparison does not. Because "double" is considered a simple type, the structEqual returns false when the bits are different, even when valueEqual returns true.

The above example used alternate zero representations, but this can also occur with multiple NaN values:

...
  var test1 = new DoubleStruct { value = CreateNaN(1) };
  var test2 = new DoubleStruct { value = CreateNaN(2) };
...
public unsafe double CreateNaN(byte lowByte)
{
  double result = double.NaN;
  ((byte*)&result)[0] = lowByte;
  return result;
}

In most ordinary situations this won't make a difference, but it is something to be aware of.
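
The gap between the two notions of equality can be seen directly by looking at the raw bits. This is a small sketch; the class and method names (DoubleEquality, SameValue, SameBits) are hypothetical helpers, while BitConverter.DoubleToInt64Bits is the standard BCL call.

```csharp
using System;

public static class DoubleEquality
{
    // Value equality as implemented by double.Equals: +0 equals -0,
    // and any NaN equals any other NaN.
    public static bool SameValue(double a, double b)
    {
        return a.Equals(b);
    }

    // Bit-pattern equality, which is what a bitwise struct comparison sees.
    public static bool SameBits(double a, double b)
    {
        return BitConverter.DoubleToInt64Bits(a) == BitConverter.DoubleToInt64Bits(b);
    }
}
```

With positive zero (1 / double.PositiveInfinity) and negative zero (1 / double.NegativeInfinity), SameValue reports true while SameBits reports false, because the two encodings differ only in the sign bit.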




Answer 4:


Here's my own attempt at this problem. It works, but I'm not convinced I've covered all the bases.

public class MemberwiseEqualityComparer : IEqualityComparer
{
    public bool Equals(object x, object y)
    {
        // ----------------------------------------------------------------
        // 1. If exactly one is null, return false.
        // 2. If they are the same reference, then they must be equal by
        //    definition.
        // 3. If the objects are both IEnumerable, return the result of
        //    comparing each item.
        // 4. If the objects are equatable, return the result of comparing
        //    them.
        // 5. If the objects are different types, return false.
        // 6. Iterate over the public properties and compare them. If there
        //    is a pair that are not equal, return false.
        // 7. Return true.
        // ----------------------------------------------------------------

        //
        // 1. If exactly one is null, return false.
        //
        if (null == x ^ null == y) return false;

        //
        // 2. If they are the same reference, then they must be equal by
        //    definition.
        //
        if (object.ReferenceEquals(x, y)) return true;

        //
        // 3. If the objects are both IEnumerable, return the result of
        //    comparing each item.
        // For collections, we want to compare the contents rather than
        // the properties of the collection itself so we check if the
        // classes are IEnumerable instances before we check to see that
        // they are the same type.
        //
        if (x is IEnumerable && y is IEnumerable && false == x is string)
        {
            return contentsAreEqual((IEnumerable)x, (IEnumerable)y);
        }

        //
        // 4. If the objects are equatable, return the result of comparing
        //    them.
        // We are assuming that the type of X implements IEquatable<> of itself
        // (see below) which is true for the numeric types and string.
        // e.g.: public class TypeOfX : IEquatable<TypeOfX> { ... }
        //
        var xType = x.GetType();
        var yType = y.GetType();
        var equatableType = typeof(IEquatable<>).MakeGenericType(xType);
        if (equatableType.IsAssignableFrom(xType)
            && xType.IsAssignableFrom(yType))
        {
            return equatablesAreEqual(equatableType, x, y);
        }

        //
        // 5. If the objects are different types, return false.
        //
        if (xType != yType) return false;

        //
        // 6. Iterate over the public properties and compare them. If there
        //    is a pair that are not equal, return false.
        //
        if (false == propertiesAndFieldsAreEqual(x, y)) return false;

        //
        // 7. Return true.
        //
        return true;
    }

    public int GetHashCode(object obj)
    {
        return null != obj ? obj.GetHashCode() : 0;
    }

    private bool contentsAreEqual(IEnumerable enumX, IEnumerable enumY)
    {
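        // Cast<object>() keeps null items; OfType<object>() would silently
        // drop them and let sequences that differ only in nulls compare equal.
        var enumOfObjX = enumX.Cast<object>();
        var enumOfObjY = enumY.Cast<object>();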

        if (enumOfObjX.Count() != enumOfObjY.Count()) return false;

        var contentsAreEqual = enumOfObjX
            .Zip(enumOfObjY) // Custom Zip extension which returns
                             // Pair<TFirst,TSecond>. Similar to .NET 4's Zip
                             // extension.
            .All(pair => Equals(pair.First, pair.Second))
            ;

        return contentsAreEqual;
    }

    private bool equatablesAreEqual(Type equatableType, object x, object y)
    {
        var equalsMethod = equatableType.GetMethod("Equals");
        var equal = (bool)equalsMethod.Invoke(x, new[] { y });
        return equal;
    }

    private bool propertiesAndFieldsAreEqual(object x, object y)
    {
        const BindingFlags bindingFlags
            = BindingFlags.Public | BindingFlags.Instance;

        var propertyValues = from pi in x.GetType()
                                         .GetProperties(bindingFlags)
                                         .AsQueryable()
                             // indexers need arguments and cannot be read this way
                             where pi.CanRead && pi.GetIndexParameters().Length == 0
                             select new
                             {
                                 Name   = pi.Name,
                                 XValue = pi.GetValue(x, null),
                                 YValue = pi.GetValue(y, null),
                             };

        var fieldValues = from fi in x.GetType()
                                      .GetFields(bindingFlags)
                                      .AsQueryable()
                          select new
                          {
                              Name   = fi.Name,
                              XValue = fi.GetValue(x),
                              YValue = fi.GetValue(y),
                          };

        var propertiesAreEqual = propertyValues.Union(fieldValues)
            .All(v => Equals(v.XValue, v.YValue))
            ;

        return propertiesAreEqual;
    }
}



Answer 5:


public static bool CompareMembers<T>(this T source, T other, params Expression<Func<object>>[] propertiesToSkip)
{
    PropertyInfo[] sourceProperties = source.GetType().GetProperties();

    // A value-type property is boxed, so its body is a UnaryExpression wrapping
    // the MemberExpression; a reference-type property is a plain MemberExpression.
    List<string> propertiesToSkipList = (from x in propertiesToSkip
                                         let a = x.Body as MemberExpression
                                         let b = x.Body as UnaryExpression
                                         select a == null ? ((MemberExpression)b.Operand).Member.Name : a.Member.Name).ToList();

    // keep every property whose name is not in the skip list
    List<PropertyInfo> lstProperties = sourceProperties
        .Where(property => !propertiesToSkipList.Contains(property.Name))
        .ToList();

    // object.Equals copes with null property values instead of throwing
    return lstProperties.All(property =>
        object.Equals(property.GetValue(source, null), property.GetValue(other, null)));
}

How to use:

bool test = myObj1.CompareMembers(myObj2,
        () => myObj1.Id,
        () => myObj1.Name);
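
The skip list works by pulling member names out of the lambda expressions. The sketch below isolates that step; MemberNames, NameOf and Person are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical sample type used only for the demonstration.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class MemberNames
{
    // Extracts the member name from a lambda such as () => person.Id.
    public static string NameOf(Expression<Func<object>> accessor)
    {
        Expression body = accessor.Body;

        // Value-type members are boxed to object, which wraps the member
        // access in a Convert node; unwrap it first.
        UnaryExpression unary = body as UnaryExpression;
        if (unary != null)
            body = unary.Operand;

        MemberExpression member = body as MemberExpression;
        if (member == null)
            throw new ArgumentException("Expected a property or field access.", "accessor");

        return member.Member.Name;
    }
}
```

Given a Person instance p, MemberNames.NameOf(() => p.Id) and MemberNames.NameOf(() => p.Name) yield the strings "Id" and "Name" respectively, the first through the boxing path and the second directly.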


Source: https://stackoverflow.com/questions/1680602/what-is-the-algorithm-used-by-the-memberwise-equality-test-in-net-structs
