Is it safe to use floats as loop counters and to increment/decrement them by fractional amounts at each iteration, as in the seemingly risk-free program below? Of course I know that using floats as operands of the == operator is a dumb thing to do. But what's wrong with using floats as operands of the other comparison operators for "normal" purposes? By "normal" I mean: even though a float may not be the exact numerical representation of the number, isn't a variation like 0.000000001 irrelevant and ignorable in most cases? (In the following program it isn't even apparent.)

That said, here is my apprehension. Suppose the representation isn't exact and 5.0 is actually 4.999999. As we go on decrementing by 0.5 at each iteration, the last comparison with 0 may turn out false and the loop may exit because of a difference of 0.000001, so the last line of the current output would not be displayed. I hope you are getting my drift. How wrong am I?
#include <stdio.h>

int main(void)
{
    float f;
    for (f = 5.0; f >= 0; f -= 0.5)
        printf("%f\n", f);
}
Output:
5.000000
4.500000
4.000000
3.500000
3.000000
2.500000
2.000000
1.500000
1.000000
0.500000
0.000000
No, it's not safe, for the reasons given in your very question. Consider this:
#include <stdio.h>

int main(void) {
    float f = 1.0;
    for (; f > 0; f -= 0.1)
        printf("%f\n", f);
    return 0;
}
This example seems to work quite OK when f is initialized to 1.0. But change that to 3.0, and things start to get way more interesting pretty soon:
2.600000
2.500000
2.400001
...
0.000001
... leading to the infamous 'off-by-one' failure.
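One way to demonstrate that off-by-one is simply to count the iterations and compare against the 30 you would expect from exact arithmetic. This is only a sketch; the exact behaviour depends on the platform's float arithmetic:

#include <stdio.h>

/* Sketch: count how many times the f > 0 loop with a 0.1 step actually
   runs when it starts at 3.0. With exact arithmetic it would be 30. */
int main(void) {
    float f = 3.0;
    int iterations = 0;
    for (; f > 0; f -= 0.1)
        iterations++;
    printf("iterations: %d\n", iterations);   /* may well be 31, not 30 */
    return 0;
}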
You think you might be safe with >= instead of >? Think again:
float f = 5.0;
for (; f >= 1; f -= 0.4)
    printf("%f\n", f);
...
3.400000
3.000000
2.599999
2.199999
1.799999
1.399999
... and off-by-one we go again (as the counter ends up at roughly 0.999999, which is less than 1).
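If you want to see the drift itself, printing the counter with a few extra digits makes it visible. A minimal sketch (the exact digits will vary with your platform's float arithmetic):

#include <stdio.h>

/* Sketch: extra digits expose the error that the 0.4 steps accumulate. */
int main(void) {
    float f = 5.0;
    for (; f >= 1; f -= 0.4)
        printf("%.7f\n", f);
    printf("loop stopped with f = %.7f\n", f);   /* just below 1.0 */
    return 0;
}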
As long as the starting value, the decrement amount, and the results of all the decrements can be represented with no error within the precision provided by the floating-point type, it is safe to use. Note that "no error" here means zero absolute error; a very small error is still considered an error.
In your case, the starting value 5.0 and the decrement amount 0.5 can be represented with no error, and 4.5, 4.0, 3.5, ..., 0.0 can also be represented with no error within the 23-bit mantissa of float. It is safe in your case.
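As a sketch of why that works: 5.0, 0.5 and every intermediate value are exact multiples of 0.5 well within float's precision, so the counter lands on exactly 0.0 (the == below is only there to demonstrate this, not a recommended practice):

#include <stdio.h>

/* Sketch: every value the counter takes is exactly representable,
   so it reaches 0.0 exactly before the loop exits. */
int main(void) {
    float f;
    int hit_zero = 0;
    for (f = 5.0; f >= 0; f -= 0.5)
        if (f == 0.0f)
            hit_zero = 1;
    printf("counter reached exactly 0.0: %s\n", hit_zero ? "yes" : "no");
    return 0;
}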
If, let's say, the starting value is 4000000.0 and the decrement amount is 0.00390625 (2^-8), then you are in trouble, because the result of the decrement cannot be represented without error within the 23-bit mantissa of float, even though the starting value and the decrement amount can each be represented exactly.
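A sketch of that failure mode: near 4000000.0 adjacent float values are 0.25 apart, so subtracting 2^-8 (less than half that spacing) rounds straight back to the original value, and a loop counter decremented this way would never move at all:

#include <stdio.h>

/* Sketch: near 4000000.0 the spacing between adjacent floats is 0.25,
   so subtracting 2^-8 rounds back to the value you started with. */
int main(void) {
    float f = 4000000.0f;
    f -= 0.00390625f;                    /* 2^-8, itself exactly representable */
    printf("after one decrement: %f\n", f);
    printf("value unchanged: %s\n", (f == 4000000.0f) ? "yes" : "no");
    return 0;
}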
However, I see no point in using floating point when an integral type is more reliable in such cases. You don't have to waste brain cells checking whether the condition I stated above applies or not.
Prefer integer values over floating point whenever possible simply because of the issues with floating point representation.
Instead of using the floating point number as your loop control, rework your logic to use integers:
Need to decrement your counter by .5? Double your starting value and decrement by 1:
float f = 5.0;
int i = f * 2;
for(; i >= 0; i--)
printf("%f\n", i / 2.0);
Need to decrement by .1?
float f = 5.0;
int i = f * 10;
for(; i >= 0; i--)
printf("%f\n", i / 10.0);
This is a simple approach for the example in the question. It is certainly not the only approach or the most correct one; a more complex example may require reworking the logic somewhat differently. Whatever fits the situation.

My point, I suppose, is to hold off working with the actual floating-point value until the last possible moment, to reduce the errors introduced by its representation.
Engineers and scientists frequently write iterative programs in which a floating point value steps through a range of values in small increments.
For example, suppose the "time" variable needs to change from a low of tMin to a high of tMax in steps of deltaT, where all these variables are doubles.
The obvious BUT INCORRECT approach is as follows:
for( time = tMin; time <= tMax; time += deltaT ) {
    // Use the time variable in the loop
}
So why is this so wrong?
If deltaT is small or the range is large (or both), the loop may execute for thousands of iterations.
That means that by the end of the loop, time has been calculated by summing thousands of addition operations.
Numbers that seem "exact" to us in decimal form, such as 0.01, are not exact when the computer stores them in binary, which means that the value used for deltaT is really an approximation of the exact value.
Therefore each addition step introduces a very small amount of roundoff error, and by the time you add up thousands of these errors, the total error can be significant.
The correct approach is as follows, if you know the minimum and maximum values and the desired change on each iteration:
int nTimes = ( tMax - tMin ) / deltaT + 1;
for( int i = 0; i < nTimes; i++ ) {
    time = tMin + i * deltaT;
    // NOW use a more accurate time variable
}

// Or alternatively, if you know the minimum, maximum, and number of desired iterations:
double deltaT = ( tMax - tMin ) / ( nTimes - 1 );
for( int i = 0; i < nTimes; i++ ) {
    time = tMin + i * deltaT;
    // NOW use a more accurate time variable
}
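As a rough, hedged comparison of the two approaches (tMin = 0, tMax = 100 and deltaT = 0.01 are just illustrative values, and the exact error printed is platform-dependent): accumulate the 10,000 additions of the incorrect loop and compare the endpoint against the single multiply-and-add used above.

#include <stdio.h>
#include <math.h>

/* Sketch: thousands of rounded additions versus one multiply and one add. */
int main(void) {
    const double tMin = 0.0, deltaT = 0.01;
    const int nTimes = 10001;                        /* 0.00, 0.01, ..., 100.00 */

    double accumulated = tMin;
    for (int i = 1; i < nTimes; i++)
        accumulated += deltaT;                       /* 10000 rounded additions */

    double computed = tMin + (nTimes - 1) * deltaT;  /* one multiply, one add */

    printf("accumulated: %.15f\n", accumulated);
    printf("computed:    %.15f\n", computed);
    printf("difference:  %.3e\n", fabs(accumulated - computed));
    return 0;
}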
In general, there are four values that can be used to specify stepping through a range: the low end of the range, the high end of the range, the number of steps to take, and the increment to take on each step. If you know any three of them, then you can calculate the fourth one.
The correct loop should use an integer counter to complete the loop a given number of times, and use the low end of the range and the increment as shown to calculate the floating point loop variable at the beginning of each iteration of the loop. So why is that better?
The number of times that the loop executes is now controlled by an integer, which does not have any roundoff error upon incrementation, so there is no chance of performing one too many or one too few iterations due to accumulated roundoff.
The time variable is now calculated from a single multiplication and a single addition, which can still introduce some roundoff error, but far less than thousands of additions. Where does that +1 come from?
The +1 is needed in order to include both endpoints of the range.
Suppose tMax were 20, tMin were 10, and deltaT were 2. The desired times would be 10, 12, 14, 16, 18, 20, which is a total of 6 time values, not 5. (Five intervals, if you want to look at it that way.) ( 20 - 10 ) / 2 yields 5, so you have to add the extra 1 to get the correct number of times, 6.
Another way of looking at this is that if nTimes is the number of data points in the range, then nTimes - 1 is the number of gaps between the data points.
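A quick sanity check of the formula with those numbers, sketched as a complete little program:

#include <stdio.h>

/* Sketch: with tMin = 10, tMax = 20 and deltaT = 2 the formula gives
   nTimes = 6, and the loop hits 10, 12, 14, 16, 18 and 20. */
int main(void) {
    const double tMin = 10.0, tMax = 20.0, deltaT = 2.0;
    int nTimes = ( tMax - tMin ) / deltaT + 1;       /* (20 - 10) / 2 + 1 = 6 */

    for (int i = 0; i < nTimes; i++)
        printf("%g\n", tMin + i * deltaT);
    return 0;
}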
Example: interpolate.c is a quick-and-dirty example of interpolating floating-point numbers in a loop, whipped up in 10 minutes in class. It is NOT an example of good code, but it is an example of how a quick little program can be used to test out, play with, or in this case demonstrate a new or unfamiliar concept.
This example interpolates the function f( x ) = x^3 over the range from -1.0 to 4.0 in steps of 0.5, using three approaches (a small sketch follows the list below):
Constant - Take the average of the inputs at the endpoints, evaluate f( average input ), and assume the function is constant over the range.
Linear - Evaluate the function at the endpoints, and then use a linear interpolation of the endpoint function values in between.
Non-Linear - Linearly interpolate the function inputs over the range, and at each evaluation point, evaluate the function of the interpolated inputs.
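Since interpolate.c itself is not reproduced here, the following is only a hypothetical sketch of the three approaches, evaluated at a single point inside one sub-interval (the sub-interval [1.0, 1.5] and the evaluation point are chosen purely for illustration):

#include <stdio.h>

/* Hypothetical sketch (not the interpolate.c referred to above): compare
   the three approaches for f(x) = x^3 at one point in the sub-interval
   [1.0, 1.5]. */
static double f(double x) { return x * x * x; }

int main(void) {
    const double x0 = 1.0, x1 = 1.5;       /* one 0.5-wide step of the range */
    const double t = 0.25;                 /* evaluation point: 1/4 of the way along */
    const double x = x0 + t * (x1 - x0);   /* linearly interpolated input */

    double constant  = f(0.5 * (x0 + x1));           /* f at the average input */
    double linear    = f(x0) + t * (f(x1) - f(x0));  /* interpolate the endpoint values */
    double nonLinear = f(x);                         /* f of the interpolated input */

    printf("constant   : %f\n", constant);
    printf("linear     : %f\n", linear);
    printf("non-linear : %f\n", nonLinear);
    return 0;
}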
Source: https://stackoverflow.com/questions/16595668/any-risk-of-using-float-variables-as-loop-counters-and-their-fractional-incremen