How to actually avoid floating point errors when you need to use float?


攒了一身酷

I am trying to affect the translation of a 3D model using some UI buttons to shift the position by 0.1 or -0.1.

My model position is a three dimensional float so sim


4 Answers

南方客

2021-01-13 18:48

The Kahan summation and pairwise summation algorithms help to reduce floating point errors. Here's some Java code for the Kahan algorithm.
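
Since the linked code is not reproduced here, the following is only a minimal sketch of Kahan (compensated) summation over `float` values. The class and method names (`KahanSum`, `sum`) are made up for illustration and do not come from the original link.

```java
import java.util.Arrays;

// Hypothetical helper class; names are illustrative only.
final class KahanSum {
    private KahanSum() {}

    /** Sums the values while carrying a correction term for lost low-order bits. */
    static float sum(float[] values) {
        float sum = 0.0f;
        float compensation = 0.0f; // estimate of the rounding error accumulated so far
        for (float value : values) {
            float corrected = value - compensation;  // apply the previous correction
            float next = sum + corrected;            // low-order bits of corrected may be lost here
            compensation = (next - sum) - corrected; // recover what was lost (with opposite sign)
            sum = next;
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] steps = new float[1000];
        Arrays.fill(steps, 0.1f);

        // Plain accumulation, for comparison with the compensated version.
        float naive = 0.0f;
        for (float step : steps) {
            naive += step;
        }
        System.out.println("naive: " + naive);
        System.out.println("kahan: " + sum(steps));
    }
}
```

Note that this only reduces the accumulated error; 0.1 itself is still not exactly representable as a float.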

滥情空心

2021-01-13 18:49

I would use a Rational class. There are many out there - this one looks like it should work.

There are two significant costs: one when the Rational is rendered into a float, and one when the numerator and denominator are reduced by their gcd. The one I posted keeps the numerator and denominator in a fully reduced state at all times, which should be quite efficient if you are always adding or subtracting 1/10.

This implementation holds the values normalised (i.e. with consistent sign) but unreduced.

You should choose your implementation to best fit your usage.
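
As a rough illustration of the idea (not either of the linked implementations), here is a minimal sketch of an immutable rational type that reduces on construction. The class name `Rational` and its methods are assumptions for this example, and overflow of `long` is deliberately not handled.

```java
// Hypothetical minimal rational number; always stored fully reduced with a positive denominator.
final class Rational {
    private final long num;
    private final long den;

    Rational(long num, long den) {
        if (den == 0) {
            throw new ArithmeticException("zero denominator");
        }
        if (den < 0) { // normalise the sign onto the numerator
            num = -num;
            den = -den;
        }
        long g = gcd(Math.abs(num), den); // den > 0, so g is never 0
        this.num = num / g;
        this.den = den / g;
    }

    private static long gcd(long a, long b) {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    Rational plus(Rational other) {
        return new Rational(num * other.den + other.num * den, den * other.den);
    }

    Rational minus(Rational other) {
        return new Rational(num * other.den - other.num * den, den * other.den);
    }

    long numerator() { return num; }

    long denominator() { return den; }

    /** Converts to float only at the point of use, e.g. when setting the model position. */
    float toFloat() {
        return (float) ((double) num / (double) den);
    }
}
```

With this, a position component would be stepped with `x = x.plus(new Rational(1, 10))` and converted with `x.toFloat()` only when the model is updated.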

無奈伤痛

2021-01-13 18:49
A simple solution is to use fixed precision, i.e. an integer that is 10x or 100x the value you want.
```
float f = 10;
f += 0.1f;
```
becomes
```
int i = 100;
i += 1;  // use as many times as you like
// use i / 10.0 as required.
```
I wouldn't use float in any case, as you get more rounding errors than with double for next to no benefit (unless you have millions of float values). double gives you about 8 more digits of precision, and with sensible rounding you won't see those errors.
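To make the fixed-precision idea concrete for the UI-button case, here is a small sketch that stores one coordinate as a count of tenths and converts only on read. The class name `FixedPointAxis` and its methods are hypothetical, chosen for this example.
```java
// Hypothetical holder for one coordinate, stored as an integer number of tenths.
final class FixedPointAxis {
    private static final double UNITS_PER_ONE = 10.0; // one stored unit represents 0.1

    private int tenths;

    FixedPointAxis(int tenths) {
        this.tenths = tenths;
    }

    void stepUp() {   // the "+0.1" button
        tenths += 1;
    }

    void stepDown() { // the "-0.1" button
        tenths -= 1;
    }

    /** A single division on read, so rounding error never accumulates across steps. */
    double value() {
        return tenths / UNITS_PER_ONE;
    }
}
```
Because the integer count is exact, pressing the buttons any number of times and returning to the starting count gives back exactly the starting value.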
小鲜肉

2021-01-13 18:49

If you stick with floats: the easiest way to avoid the error is to use a float that is exactly representable but close to the desired value, namely

round(2^n * value) * 1/2^n.

Here n is the number of fractional bits and value is the number to approximate (in your case 0.1).

In your case with increasing precision:

n = 4 => 0.125
n = 8 (byte) => 0.1015625
n = 16 (short) => 0.100006103516....

The long digit strings are artefacts of converting the binary value to decimal; the actual binary number has far fewer bits.

As these floats are exact, addition and subtraction will not introduce offset errors and will always be predictable, as long as the result does not need more bits than the float's mantissa holds.
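
The rounding formula above can be sketched as follows. This is only an illustration; the class and method names (`DyadicStep`, `nearestDyadic`) are invented for the example, and n is assumed to be small enough (0 to 23) that everything fits in a float's 24-bit mantissa.

```java
// Hypothetical helper: rounds a step size to the nearest multiple of 1/2^n.
final class DyadicStep {
    private DyadicStep() {}

    /** Returns round(2^n * value) / 2^n, which is exactly representable as a float. */
    static float nearestDyadic(float value, int n) {
        float scale = (float) (1L << n);            // 2^n, exact for the assumed range of n
        return Math.round(value * scale) / scale;   // Math.round(float) returns an int
    }

    public static void main(String[] args) {
        float step = nearestDyadic(0.1f, 8);
        float x = 0.0f;
        for (int i = 0; i < 1000; i++) {
            x += step; // every partial sum is a multiple of 1/256, so no rounding occurs
        }
        for (int i = 0; i < 1000; i++) {
            x -= step;
        }
        System.out.println("step = " + step + ", x after round trip = " + x);
    }
}
```

The trade-off is that the step is no longer exactly 0.1, only close to it.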

If you fear that your display will be compromised by using this solution (because the values are odd-looking floats), use and store only an integer step count (each button press changes it by +1 or -1). The final value which is internally set is

x = value * step.

As the step count only ever increases or decreases by exactly 1, precision will be retained.
