Prevent long-running averaging from overflowing?

六月ゝ 毕业季﹏ submitted on 2019-12-05 17:43:56

Aggregated buckets.

We pick a bucket size that's comfortably less than the square root of MAXINT. To keep the example simple, let's pick 10.

Each new value is added to the current bucket, and the moving average can be computed as you describe.

When the bucket is full, start a new bucket and remember the average of the full one. We can safely calculate the overall average by combining the averages of the full buckets with the current, partial bucket. When we get to 10 full buckets, we create a bigger bucket with capacity 100.

To compute the total average, we first compute the average of the "10s" and then combine that with the "100s". This pattern repeats for "1,000s", "10,000s" and so on. At each stage we only need to consider two levels, one 10 times bigger than the other.
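The bucket scheme could be sketched in Java roughly as follows. This is only one possible reading of the description: the class name BucketedAverage, its methods, and the choice to keep one partial bucket per level are assumptions made for illustration, and the levels are combined as weighted averages so that no grand total is ever formed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the bucket idea; all names here are illustrative.
class BucketedAverage {
    private static final int BUCKET_SIZE = 10; // the size picked in the text

    // sums.get(k) and counts.get(k) describe the partial bucket at level k.
    // Each entry at level k stands for the average of BUCKET_SIZE^k raw values.
    private final List<Double> sums = new ArrayList<>();
    private final List<Integer> counts = new ArrayList<>();

    void add(double value) {
        double entry = value;
        for (int level = 0; ; level++) {
            if (level == sums.size()) { // first time this level is needed
                sums.add(0.0);
                counts.add(0);
            }
            double sum = sums.get(level) + entry;
            int count = counts.get(level) + 1;
            if (count < BUCKET_SIZE) { // bucket still has room
                sums.set(level, sum);
                counts.set(level, count);
                return;
            }
            // Bucket is full: empty it and carry its average up one level.
            sums.set(level, 0.0);
            counts.set(level, 0);
            entry = sum / BUCKET_SIZE;
        }
    }

    double average() {
        double mean = 0.0;
        double totalWeight = 0.0;
        double entryWeight = 1.0; // raw values represented by one entry at this level
        for (int level = 0; level < sums.size(); level++) {
            int count = counts.get(level);
            if (count > 0) {
                double levelMean = sums.get(level) / count;
                double levelWeight = count * entryWeight;
                totalWeight += levelWeight;
                // Weighted merge of two averages; no grand total is ever formed.
                mean += (levelMean - mean) * (levelWeight / totalWeight);
            }
            entryWeight *= BUCKET_SIZE;
        }
        return totalWeight == 0.0 ? Double.NaN : mean; // NaN when nothing was added
    }
}
```

Each level never holds more than nine entries in this sketch, so the per-level sums stay within about ten times the largest input, however long the stream runs.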

Use double total; and unsigned long long count; to hold the running sum and the number of values. You should still worry about accuracy, but it will be much less of a problem than with float.
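A minimal Java sketch of the same idea, for comparison. Java has no unsigned long long, so a signed long counter is assumed here, and the class name RunningAverage is made up for the example.

```java
// Hypothetical sketch: running sum in a double, element count in a long.
class RunningAverage {
    private double total = 0.0; // plays the role of `double total`
    private long count = 0;     // stand-in for `unsigned long long count`

    void add(double value) {
        total += value; // cannot wrap around like an int, but rounding error can accumulate
        count++;
    }

    double average() {
        return count == 0 ? Double.NaN : total / count; // NaN when nothing was added
    }
}
```

A double reaches about 1.8e308 before it overflows to infinity, so in practice the concern shifts from overflow to rounding, which is the accuracy caveat mentioned above.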

What about using arbitrary-precision arithmetic?

There's a list of libraries you could use on Wikipedia: http://en.wikipedia.org/wiki/Bignum#Libraries

Most arbitrary-precision arithmetic libraries will not overflow until the stored digits fill the available memory (which is quite unlikely).
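In Java, for instance, the standard java.math.BigInteger and java.math.BigDecimal classes could play this role. The sketch below is only an illustration; the class name BigAverage and the choice of MathContext.DECIMAL128 for the final division are assumptions.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;

// Hypothetical sketch: exact integer sum with BigInteger, rounded only when dividing.
class BigAverage {
    private BigInteger total = BigInteger.ZERO;
    private long count = 0;

    void add(long value) {
        total = total.add(BigInteger.valueOf(value)); // grows as needed, never wraps
        count++;
    }

    BigDecimal average() {
        if (count == 0) {
            throw new IllegalStateException("no values added yet");
        }
        // DECIMAL128 keeps 34 significant digits; an assumption, pick what you need.
        return new BigDecimal(total).divide(BigDecimal.valueOf(count), MathContext.DECIMAL128);
    }
}
```

The trade-off is speed and memory: every add allocates a new BigInteger, which is usually fine for moderate input rates but slower than primitive arithmetic.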

You want to use Kahan's summation algorithm:

http://en.wikipedia.org/wiki/Kahan_summation_algorithm

See also the section about errors in summation in "What Every Computer Scientist Should Know About Floating-Point Arithmetic":

http://docs.sun.com/source/806-3568/ncg_goldberg.html#1262
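For reference, compensated (Kahan) summation applied to an average might look like this in Java. It is a sketch of the textbook algorithm, not code from either link, and the class and method names are made up for the example.

```java
// Hypothetical sketch of Kahan (compensated) summation used to compute an average.
class KahanAverage {
    static double sum(double[] values) {
        double sum = 0.0;
        double compensation = 0.0; // running estimate of the low-order bits lost so far
        for (double value : values) {
            double corrected = value - compensation; // re-apply what was lost earlier
            double next = sum + corrected;           // low-order bits of corrected may be lost here
            compensation = (next - sum) - corrected; // recover what was just lost
            sum = next;
        }
        return sum;
    }

    static double average(double[] values) {
        return sum(values) / values.length; // NaN for an empty array (0.0 / 0)
    }
}
```

Kahan summation addresses rounding error, not overflow: the sum is still a double, so it pairs naturally with a double total rather than replacing it.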

You could use these special datatypes, in which integers can keep growing until your RAM is full.

I was just thinking about this too. I think this solution works if you view each new value as 'moving the needle': it moves the average by its distance from the average so far, divided by the number of previous values that contributed to that average (plus 1 for itself). It loses accuracy as the number of inputs grows, but should be acceptable in practice. Here's some Java code that seems to work. I used floats and ints here to demonstrate that it works even with those limitations, but you could use double to gain accuracy. This is just to give you an idea of how to average an array of near-max integers. You need to keep track of the total number of inputs and the current average, but not the total sum of the inputs. If your total number of inputs approaches MAX_INT, this eventually won't work and you should use the bucket suggestion above, but that is pretty drastic in most cases.

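    // Incremental mean: each new value shifts the average by (value - ave) / n,
    // so no running sum is ever stored.
    public float calcAverageContinuous(int[] integers) {
        float ave = 0;
        for (int i = 0; i < integers.length; i++) {
            // i + 1 is the number of values seen so far, including this one
            ave += ((float) integers[i] - ave) / (float) (i + 1);
        }
        return ave;
    }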
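For a stream rather than an array, the same update can be kept in a small object that stores only the current average and the count. This is a sketch under assumptions: the class name IncrementalMean is made up, and double and long are used instead of float and int for the extra accuracy and range mentioned above.

```java
// Hypothetical streaming variant of the incremental-mean update.
class IncrementalMean {
    private double mean = 0.0;
    private long count = 0;

    void add(double value) {
        count++;
        // Move the mean toward the new value by 1/count of the distance between them.
        mean += (value - mean) / count;
    }

    double mean() {
        return mean; // 0.0 until the first value is added
    }
}
```

Because only the mean and the count are stored, the state never grows beyond the magnitude of the inputs themselves.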