Question
Recently I was going through an old blog post by Eric Lippert in which, while writing about associativity, he mentions that in C#, (a + b) + c is not equivalent to a + (b + c) for certain values of a, b, and c.
I am not able to figure out for which types and ranges of values that holds true, and why.
Answer 1:
On the range of the double type:

double dbl1 = (double.MinValue + double.MaxValue) + double.MaxValue;
double dbl2 = double.MinValue + (double.MaxValue + double.MaxValue);

The first one is double.MaxValue, the second one is double.PositiveInfinity.
On the precision of the double type:

double dbl1 = (double.MinValue + double.MaxValue) + double.Epsilon;
double dbl2 = double.MinValue + (double.MaxValue + double.Epsilon);

Now dbl1 == double.Epsilon, while dbl2 == 0.
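Both cases above can be reproduced in any language with IEEE 754 doubles; a quick Python sketch, using sys.float_info.max for C#'s double.MaxValue and the literal 5e-324 (the smallest subnormal) for double.Epsilon:

```python
import sys

mx = sys.float_info.max   # corresponds to C#'s double.MaxValue
mn = -mx                  # corresponds to double.MinValue
eps = 5e-324              # corresponds to double.Epsilon (smallest subnormal)

# Range: the grouping decides whether an intermediate sum overflows.
print((mn + mx) + mx)     # double.MaxValue
print(mn + (mx + mx))     # inf, because mx + mx overflows to infinity

# Precision: the grouping decides whether eps is absorbed by mx.
print((mn + mx) + eps)    # 5e-324
print(mn + (mx + eps))    # 0.0, because eps vanished into mx
```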
And on literally reading the question :-)
In checked mode:

checked
{
    int i1 = (int.MinValue + int.MaxValue) + int.MaxValue;
}

i1 is int.MaxValue - 1 (that is, 2147483646, because int.MinValue + int.MaxValue is -1).
checked
{
    int temp = int.MaxValue;
    int i2 = int.MinValue + (temp + temp);
}

(note the use of the temp variable; otherwise the compiler rejects the constant expression int.MaxValue + int.MaxValue outright. Technically even that would be a different result :-) compiles correctly vs. doesn't compile)
This throws an OverflowException... The results are different :-) (int.MaxValue - 1 vs. an exception).
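The checked behavior is C#-specific, but easy to model; a minimal Python sketch, where checked_add is a hypothetical helper mimicking checked Int32 addition:

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def checked_add(a, b):
    # mimic C#'s checked Int32 addition: raise instead of wrapping
    s = a + b
    if not INT_MIN <= s <= INT_MAX:
        raise OverflowError("Int32 overflow")
    return s

# (int.MinValue + int.MaxValue) + int.MaxValue
i1 = checked_add(checked_add(INT_MIN, INT_MAX), INT_MAX)
print(i1)   # 2147483646, i.e. int.MaxValue - 1

# int.MinValue + (int.MaxValue + int.MaxValue) overflows in the inner sum:
try:
    checked_add(INT_MIN, checked_add(INT_MAX, INT_MAX))
except OverflowError:
    print("overflow")
```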
Answer 2:
One example:

a = 1e-30
b = 1e+30
c = -1e+30

Here (a + b) + c evaluates to 0, because a is too small to change b, while a + (b + c) evaluates to 1e-30.
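This example is easy to check in Python, which uses the same IEEE 754 doubles as C#'s double:

```python
a, b, c = 1e-30, 1e+30, -1e+30

left = (a + b) + c    # a is absorbed into b by rounding, so this is b + c == 0.0
right = a + (b + c)   # b + c cancels exactly, leaving a == 1e-30

print(left, right)    # 0.0 1e-30
```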
Answer 3:
Extending on the other answers which show how with extremes of small and large numbers you get a different result, here's an example where floating point with realistic normal numbers gives you a different answer.
In this case, instead of using numbers at the extreme limits of precision, I simply do a lot of additions. The difference is between doing (((...(((a+b)+c)+d)+e)... or ...(((a+b)+(c+d))+((e+f)+(g+h)))+...
I'm using Python here, but you will probably get the same results if you write this in C#. First create a list of a million values, all of which are 0.1. Add them up from the left and you see the rounding errors become significant:
>>> numbers = [0.1]*1000000
>>> sum(numbers)
100000.00000133288
Now add them again, but this time add them in pairs (there are much more efficient ways to do this that use less intermediate storage, but I kept the implementation simple here):
>>> def pair_sum(numbers):
...     if len(numbers) == 1:
...         return numbers[0]
...     if len(numbers) % 2:
...         numbers.append(0)
...     return pair_sum([a+b for a,b in zip(numbers[::2], numbers[1::2])])
>>> pair_sum(numbers)
100000.0
This time any rounding errors are minimised.
Edit: for completeness, here's a more efficient but less easy to follow implementation of a pairwise sum. It gives the same answer as the pair_sum() above:
def pair_sum(seq):
    tmp = []
    for i, v in enumerate(seq):
        if i & 1:
            tmp[-1] = tmp[-1] + v
            i = i + 1
            n = i & -i
            while n > 2:
                t = tmp.pop(-1)
                tmp[-1] = tmp[-1] + t
                n >>= 1
        else:
            tmp.append(v)
    while len(tmp) > 1:
        t = tmp.pop(-1)
        tmp[-1] = tmp[-1] + t
    return tmp[0]
And here's the simple pair_sum written in C#:
using System;
using System.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        static double pair_sum(double[] numbers)
        {
            if (numbers.Length == 1)
            {
                return numbers[0];
            }
            var new_numbers = new double[(numbers.Length + 1) / 2];
            for (var i = 0; i < numbers.Length - 1; i += 2)
            {
                new_numbers[i / 2] = numbers[i] + numbers[i + 1];
            }
            if (numbers.Length % 2 != 0)
            {
                new_numbers[new_numbers.Length - 1] = numbers[numbers.Length - 1];
            }
            return pair_sum(new_numbers);
        }

        static void Main(string[] args)
        {
            var numbers = new double[1000000];
            for (var i = 0; i < numbers.Length; i++) numbers[i] = 0.1;
            Console.WriteLine(numbers.Sum());
            Console.WriteLine(pair_sum(numbers));
        }
    }
}
with output:
100000.000001333
100000
Answer 4:
This stems from the fact that ordinary value types (int, long, etc.) are stored using a fixed number of bytes. Overflow is thus possible when the sum of two values exceeds that storage capacity.
In C#, one may use BigInteger to avoid this kind of issue. BigIntegers are arbitrary in size and therefore cannot overflow.
BigInteger is only available from .NET 4.0 onwards (VS 2010+).
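For comparison, Python's built-in int is arbitrary-precision, much like C#'s BigInteger, so integer addition there never overflows regardless of grouping:

```python
# No wraparound: the sum of two 32-bit maxima is exact,
# rather than a wrapped negative value.
total = (2**31 - 1) + (2**31 - 1)
print(total)   # 4294967294
```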
Answer 5:
Short answer: (a + b) + c == a + (b + c) mathematically, but not necessarily computationally.
Remembering that computers really work in binary, even simple decimals can result in round-off errors when converted to the internal format.
Depending on the language, even addition can incur round-off errors, and in the above example the round-off error in a + b may differ from that in b + c.
One surprising offender is JavaScript: 0.1 + 0.2 != 0.3. The round-off error is a long way down the decimal, but real and problematic.
It's a general principle that you reduce round-off error by adding the small parts first. This way they can accumulate before being overwhelmed by the larger number.
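A small Python illustration of that principle, with the magnitudes chosen as an assumption for the demo: at 1e16, consecutive doubles are 2.0 apart, so a lone 1.0 falls below the rounding granularity:

```python
big = 1e16
small = [1.0] * 100   # each 1.0 is below the spacing of doubles near 1e16

# Large number first: every 1.0 is individually rounded away.
big_first = big
for x in small:
    big_first += x

# Small parts first: they accumulate to 100.0 before meeting big.
small_first = sum(small) + big

print(big_first - big)     # 0.0
print(small_first - big)   # 100.0
```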
Answer 6:
A couple of similar examples:

static void A(string s, int i, int j)
{
    var test1 = (s + i) + j;
    var test2 = s + (i + j);
    var testX = s + i + j;
}

Here A("Hello", 3, 5) leads to test1 and testX being equal to "Hello35", while test2 will be "Hello8".
And:

static void B(int i, int j, long k)
{
    var test1 = (i + j) + k;
    var test2 = i + (j + k);
    var testX = i + j + k;
}

Here B(2000000000, 2000000000, 42L) leads to test1 and testX being equal to -294967254L in the usual unchecked mode, while test2 becomes 4000000042L.
Source: https://stackoverflow.com/questions/32004190/associativity-math-a-b-c-a-b-c