Mathematical explanation why Decimal's conversion to Double is broken and Decimal.GetHashCode separates equal instances

萌比男神i 2021-02-12 09:33

I am not sure if this non-standard way of stating a Stack Overflow question is good or bad, but here goes:

What is the best (mathematical or otherwise technical) explana

2 answers
  •  栀梦 (original poster)
    2021-02-12 09:54

    In Decimal.cs, we can see that GetHashCode() is implemented as native code. Furthermore, we can see that the cast to double is implemented as a call to ToDouble(), which in turn is implemented as native code. So from there, we can't see a logical explanation for the behaviour.

    In the old Shared Source CLI, we can find old implementations of these methods that hopefully shed some light on this, provided they haven't changed too much. In comdecimal.cpp we find:

    FCIMPL1(INT32, COMDecimal::GetHashCode, DECIMAL *d)
    {
        WRAPPER_CONTRACT;
        STATIC_CONTRACT_SO_TOLERANT;
    
        ENSURE_OLEAUT32_LOADED();
    
        _ASSERTE(d != NULL);
        double dbl;
        VarR8FromDec(d, &dbl);
        if (dbl == 0.0) {
            // Ensure 0 and -0 have the same hash code
            return 0;
        }
        return ((int *)&dbl)[0] ^ ((int *)&dbl)[1];
    }
    FCIMPLEND
    

    and

    FCIMPL1(double, COMDecimal::ToDouble, DECIMAL d)
    {
        WRAPPER_CONTRACT;
        STATIC_CONTRACT_SO_TOLERANT;
    
        ENSURE_OLEAUT32_LOADED();
    
        double result;
        VarR8FromDec(&d, &result);
        return result;
    }
    FCIMPLEND
    

    We can see that the GetHashCode() implementation is based on the conversion to double: the hash code is computed from the bytes that result from that conversion. It therefore relies on the assumption that equal decimal values convert to equal double values.
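    To make the bit manipulation concrete, here is a minimal Python sketch of what the last line of that GetHashCode() does, assuming the little-endian layout used on x86, so that index 0 holds the low 32 bits. The names old_hash and next_double_up are made up for illustration and are not part of any .NET API.

```python
import struct

def old_hash(dbl: float) -> int:
    """Mimic ((int *)&dbl)[0] ^ ((int *)&dbl)[1] on a little-endian machine."""
    if dbl == 0.0:
        # 0.0 and -0.0 compare equal, so both map to 0
        return 0
    # Reinterpret the 8 bytes of the double as two signed 32-bit integers.
    lo, hi = struct.unpack("<ii", struct.pack("<d", dbl))
    return lo ^ hi

def next_double_up(dbl: float) -> float:
    """Hypothetical helper: the double whose bit pattern is one larger."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", dbl))
    return struct.unpack("<d", struct.pack("<Q", bits + 1))[0]

a = 42.0
b = next_double_up(a)   # differs from 42.0 only in the last bit
print(hex(old_hash(a)), hex(old_hash(b)))   # 0x40450000 0x40450001
```

    A difference in just the last bit of the converted double is therefore already enough to produce a different hash code.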

    So let's test the VarR8FromDec system call outside of .NET:

    In Delphi (I'm actually using FreePascal), here's a short program to call the system functions directly to test their behaviour:

    {$MODE Delphi}
    program Test;
    uses
      Windows,
      SysUtils,
      Variants;
    type
      Decimal = TVarData;
    function VarDecFromStr(const strIn: WideString; lcid: LCID; dwFlags: ULONG): Decimal; safecall; external 'oleaut32.dll';
    function VarDecAdd(const decLeft, decRight: Decimal): Decimal; safecall; external 'oleaut32.dll';
    function VarDecSub(const decLeft, decRight: Decimal): Decimal; safecall; external 'oleaut32.dll';
    function VarDecDiv(const decLeft, decRight: Decimal): Decimal; safecall; external 'oleaut32.dll';
    function VarBstrFromDec(const decIn: Decimal; lcid: LCID; dwFlags: ULONG): WideString; safecall; external 'oleaut32.dll';
    function VarR8FromDec(const decIn: Decimal): Double; safecall; external 'oleaut32.dll';
    var
      Zero, One, Ten, FortyTwo, Fraction: Decimal;
      I: Integer;
    begin
      try
        Zero := VarDecFromStr('0', 0, 0);
        One := VarDecFromStr('1', 0, 0);
        Ten := VarDecFromStr('10', 0, 0);
        FortyTwo := VarDecFromStr('42', 0, 0);
        Fraction := One;
        for I := 1 to 40 do
        begin
          FortyTwo := VarDecSub(VarDecAdd(FortyTwo, Fraction), Fraction);
          Fraction := VarDecDiv(Fraction, Ten);
          Write(I: 2, ': ');
          if VarR8FromDec(FortyTwo) = 42 then WriteLn('ok') else WriteLn('not ok');
        end;
      except on E: Exception do
        WriteLn(E.Message);
      end;
    end.
    

    Note that since Delphi and FreePascal have no language support for any floating-point decimal type, I'm calling system functions to perform the calculations. I first set FortyTwo to 42. I then add 1 and subtract 1, then add 0.1 and subtract 0.1, and so on. This extends the precision of the decimal in the same way as it happens in .NET.
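    The same representation effect can be sketched with Python's decimal module, which also keeps trailing zeros. This is only an analogy for the DECIMAL layout, not the same type. The naive_to_double helper is an assumption about the general shape of such a conversion (integer coefficient to double, then a division by a power of ten); nothing above shows how VarR8FromDec actually works internally.

```python
from decimal import Decimal

def add_and_subtract(steps: int) -> Decimal:
    """Add and subtract 1, 0.1, 0.01, ... to 42, as in the Pascal loop."""
    forty_two = Decimal(42)
    fraction = Decimal(1)
    for _ in range(steps):
        forty_two = (forty_two + fraction) - fraction
        fraction = fraction / Decimal(10)
    return forty_two

def naive_to_double(value: Decimal) -> float:
    """Hypothetical conversion: coefficient to double, then divide by 10**scale."""
    sign, digits, exponent = value.as_tuple()
    coefficient = int("".join(map(str, digits)))
    scale = -exponent
    result = float(coefficient) / float(10 ** scale)
    return -result if sign else result

plain = Decimal(42)
padded = add_and_subtract(5)
print(plain, padded)             # 42 42.0000 (same value, different stored digits)
print(plain == padded)           # True (comparison is by numerical value)
print(naive_to_double(padded))   # 42.0 (at this small scale every step is exact)
```

    For sufficiently large scales the coefficient is no longer exactly representable as a double, so it is rounded before the division. Under such a scheme, equal decimals stored with different scales are then no longer guaranteed to arrive at the same double.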

    And here's (part of) the output:

    ...
    20: ok
    21: ok
    22: not ok
    23: ok
    24: not ok
    25: ok
    26: ok
    ...
    

    This shows that it is indeed a long-standing problem in Windows that merely happens to be exposed by .NET. The system functions give different results for equal decimal values, so either they should be fixed, or .NET should be changed to not use defective functions.

    Now, in the new .NET Core, decimal.cpp contains code to work around the problem:

    FCIMPL1(INT32, COMDecimal::GetHashCode, DECIMAL *d)
    {
        FCALL_CONTRACT;
    
        ENSURE_OLEAUT32_LOADED();
    
        _ASSERTE(d != NULL);
        double dbl;
        VarR8FromDec(d, &dbl);
        if (dbl == 0.0) {
            // Ensure 0 and -0 have the same hash code
            return 0;
        }
        // conversion to double is lossy and produces rounding errors so we mask off the lowest 4 bits
        // 
        // For example these two numerically equal decimals with different internal representations produce
        // slightly different results when converted to double:
        //
        // decimal a = new decimal(new int[] { 0x76969696, 0x2fdd49fa, 0x409783ff, 0x00160000 });
        //                     => (decimal)1999021.176470588235294117647000000000 => (double)1999021.176470588
        // decimal b = new decimal(new int[] { 0x3f0f0f0f, 0x1e62edcc, 0x06758d33, 0x00150000 }); 
        //                     => (decimal)1999021.176470588235294117647000000000 => (double)1999021.1764705882
        //
        return ((((int *)&dbl)[0]) & 0xFFFFFFF0) ^ ((int *)&dbl)[1];
    }
    FCIMPLEND
    

    This appears to be implemented in the current .NET Framework too, based on the fact that one of the wrong double values does give the same hash code, but it's not enough to completely fix the problem.
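    As a rough illustration of why masking four bits can only help partially, here is a Python sketch of the masked computation, again assuming a little-endian layout. The names masked_hash and from_bits are illustrative, not .NET APIs. Clearing the low 4 bits merges doubles that fall into the same bucket of 16 adjacent bit patterns, but two neighbouring doubles on either side of a bucket boundary still hash differently.

```python
import struct

def masked_hash(dbl: float) -> int:
    """Mimic (((int *)&dbl)[0] & 0xFFFFFFF0) ^ ((int *)&dbl)[1]."""
    if dbl == 0.0:
        return 0
    lo, hi = struct.unpack("<ii", struct.pack("<d", dbl))
    # ~0xF clears the lowest 4 bits of the low word before the XOR.
    return (lo & ~0xF) ^ hi

def from_bits(bits: int) -> float:
    """Hypothetical helper: build a double from its 64-bit pattern."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

base = 0x4045000000000000   # bit pattern of 42.0
same_bucket = (from_bits(base), from_bits(base + 1))
across_boundary = (from_bits(base + 0xF), from_bits(base + 0x10))

print(masked_hash(same_bucket[0]) == masked_hash(same_bucket[1]))          # True
print(masked_hash(across_boundary[0]) == masked_hash(across_boundary[1]))  # False
```

    The second pair differs by a single bit in the last place, just like the first, yet still produces two different hash codes.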
