I use the Kahan summation algorithm:
inline void KahanSum(float value, float & sum, float & correction)
{
float term = value - correction;
fl
I suppose this is the result of aggressive compiler optimization. GCC can reduce the expression from:
float term = value - correction;
float temp = sum + term;
correction = (temp - sum) - term;
sum = temp;
to
float term = value - correction;
correction = 0;
sum += term;
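because this transformation is correct in exact real arithmetic, where (temp - sum) - term is always zero. In floating point that expression is the rounding error of sum + term, so the optimization discards exactly what the correction is meant to capture and kills the Kahan algorithm. GCC is normally allowed to reassociate floating-point expressions this way only when options such as -ffast-math (implied by -Ofast) are in effect.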
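To make the difference concrete, here is a minimal sketch that runs both forms of the step on the same input. The names KahanStep, ReducedStep and AddRepeatedly are hypothetical helpers introduced only for this illustration. It assumes IEEE-754 single precision with round-to-nearest and a build without -ffast-math.

```cpp
#include <cassert>

// Hypothetical helper: one Kahan step, as in the original code.
inline void KahanStep(float value, float & sum, float & correction)
{
    float term = value - correction;
    float temp = sum + term;
    correction = (temp - sum) - term; // rounding error of sum + term
    sum = temp;
}

// Hypothetical helper: the same step after the algebraic simplification.
inline void ReducedStep(float value, float & sum, float & correction)
{
    float term = value - correction;
    correction = 0; // zero only in exact arithmetic
    sum += term;
}

// Adds value to start count times using the given step function.
template <class Step> float AddRepeatedly(float start, float value, int count, Step step)
{
    assert(count >= 0);
    float sum = start, correction = 0;
    for (int i = 0; i < count; ++i)
        step(value, sum, correction);
    return sum;
}
```

Floats near 1e8 are spaced 8 apart, so a single 1.0f added to 1e8f is rounded away. Under the assumptions above, AddRepeatedly(1e8f, 1.0f, 8, KahanStep) returns 100000008, while AddRepeatedly(1e8f, 1.0f, 8, ReducedStep) stays at 100000000, which is what plain summation gives.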
To avoid this problem, you can compile the affected functions with GCC's "-O1" optimization level. It would look something like this:
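#include <stddef.h> // size_t

#if defined(__GNUC__)
# pragma GCC push_options
# pragma GCC optimize ("O1")
#endif

// One Kahan step: adds value to sum and carries the rounding error in correction.
inline void KahanSum(float value, float & sum, float & correction)
{
    float term = value - correction;  // apply the error carried from the previous step
    float temp = sum + term;          // low-order bits of term may be lost here
    correction = (temp - sum) - term; // recover what was lost
    sum = temp;
}

// Sums size floats starting at ptr using Kahan compensation.
float KahanSum(const float * ptr, size_t size)
{
    float sum = 0, correction = 0;
    for(size_t i = 0; i < size; ++i)
        KahanSum(ptr[i], sum, correction);
    return sum;
}

#if defined(__GNUC__)
# pragma GCC pop_options
#endif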
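If you want to check that the compensation actually survived a particular build, a rough self-test along these lines may help. It is only a sketch: GuardedKahanSum, ReferenceSum and KahanSurvivesBuild are hypothetical names, and the input (ten million copies of 0.1f) and the tolerance of 1.0 are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

#if defined(__GNUC__)
# pragma GCC push_options
# pragma GCC optimize ("O1")
#endif
// Same array summation as above, renamed to keep this sketch self-contained.
float GuardedKahanSum(const float * ptr, size_t size)
{
    float sum = 0, correction = 0;
    for (size_t i = 0; i < size; ++i)
    {
        float term = ptr[i] - correction;
        float temp = sum + term;
        correction = (temp - sum) - term;
        sum = temp;
    }
    return sum;
}
#if defined(__GNUC__)
# pragma GCC pop_options
#endif

// Reference value accumulated in double precision.
double ReferenceSum(const float * ptr, size_t size)
{
    double sum = 0;
    for (size_t i = 0; i < size; ++i)
        sum += ptr[i];
    return sum;
}

// Returns true if the Kahan result stays within tolerance of the double reference.
// Size and tolerance are assumed values for this illustration.
bool KahanSurvivesBuild(size_t size = 10000000, double tolerance = 1.0)
{
    assert(size > 0);
    std::vector<float> data(size, 0.1f);
    double reference = ReferenceSum(data.data(), data.size());
    float kahan = GuardedKahanSum(data.data(), data.size());
    return std::fabs(kahan - reference) <= tolerance;
}
```

With a working correction term the float result should land very close to the double reference. If the compiler has collapsed the correction to zero, the loop degrades to plain float summation, whose error on this input is far larger than the tolerance, so KahanSurvivesBuild() would return false.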