I know rounding errors happen in floating point arithmetic but can somebody explain the reason for this one:
>>> 8.0 / 0.4   # as expected
20.0
>>> 8.0 // 0.4  # unexpected: why not 20.0?
19.0
That's because there is no exact 0.4 in Python's binary floating-point representation; what actually gets stored is a float slightly larger than 0.4 (something like 0.4000000000000001), which makes the floor of the division come out as 19:
>>> floor(8//0.4000000000000001)
19.0
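You can see the value that the literal 0.4 actually stores by converting it to a Decimal (just an illustration; this is not how division itself works internally):
>>> from decimal import Decimal
>>> Decimal(0.4)
Decimal('0.40000000000000002220446049250313080847263336181640625')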
True division (/), on the other hand, returns a reasonable approximation of the division result if the arguments are floats or complex: the arguments are divided as C doubles and the quotient is rounded to the nearest representable float. That's why the result of 8.0/0.4 is 20.0.
Read more about why Python's integer division floors in the post by Guido himself.
Also, for complete information about floating-point numbers, you can read "What Every Computer Scientist Should Know About Floating-Point Arithmetic": https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
For those who are interested, here is float_div, the function in CPython's source code that performs true division for floats:
static PyObject *
float_div(PyObject *v, PyObject *w)
{
    double a, b;
    CONVERT_TO_DOUBLE(v, a);
    CONVERT_TO_DOUBLE(w, b);
    if (b == 0.0) {
        PyErr_SetString(PyExc_ZeroDivisionError,
                        "float division by zero");
        return NULL;
    }
    PyFPE_START_PROTECT("divide", return 0)
    a = a / b;                      /* plain C double division */
    PyFPE_END_PROTECT(a)
    return PyFloat_FromDouble(a);
}
The final result is then wrapped into a Python float object by PyFloat_FromDouble:
PyObject *
PyFloat_FromDouble(double fval)
{
    PyFloatObject *op = free_list;
    if (op != NULL) {
        /* reuse an object from the float free list */
        free_list = (PyFloatObject *) Py_TYPE(op);
        numfree--;
    } else {
        op = (PyFloatObject*) PyObject_MALLOC(sizeof(PyFloatObject));
        if (!op)
            return PyErr_NoMemory();
    }
    /* Inline PyObject_New */
    (void)PyObject_INIT(op, &PyFloat_Type);
    op->ob_fval = fval;
    return (PyObject *) op;
}
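You can confirm from the Python side that this C-level double division produces exactly the double 20.0, for example by comparing hex representations:
>>> (8.0 / 0.4).hex()
'0x1.4000000000000p+4'
>>> (20.0).hex()
'0x1.4000000000000p+4'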
Ok, after a little bit of research I have found this issue. What seems to be happening is that, as @khelwood suggested, 0.4 evaluates internally to 0.40000000000000002220, and dividing 8.0 by that yields something slightly smaller than 20.0. The / operator then rounds to the nearest floating-point number, which is 20.0, but the // operator immediately truncates the result, yielding 19.0.
This should be faster, and I suppose it's "closer to the processor", but it still isn't what the user wants or expects.
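To verify that the quotient really is a hair below 20.0, you can redo the division at higher precision with the decimal module (30 digits of precision is an arbitrary choice; the point is that the quotient of the actually-stored values sits just below 20):
>>> from decimal import Decimal, getcontext
>>> getcontext().prec = 30
>>> Decimal(8.0) / Decimal(0.4)
Decimal('19.9999999999999988897769753748')
>>> 8.0 / 0.4
20.0
>>> 8.0 // 0.4
19.0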
@jotasi explained the true reason behind it. However, if you want to prevent it, you can use the decimal module, which was basically designed to represent decimal floating-point numbers exactly, in contrast to the binary floating-point representation. So in your case you could do something like:
>>> from decimal import *
>>> Decimal('8.0')//Decimal('0.4')
Decimal('20')
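One pitfall to keep in mind (assuming the default decimal context): pass the values as strings. Constructing a Decimal from the float 0.4 carries over the binary rounding error and brings the surprise right back:
>>> Decimal('8.0') // Decimal('0.4')
Decimal('20')
>>> Decimal('8.0') // Decimal(0.4)
Decimal('19')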
Reference: https://docs.python.org/2/library/decimal.html
As you and khelwood already noticed, 0.4 cannot be exactly represented as a float. Why? It is two fifths (4/10 == 2/5), which does not have a finite binary fraction representation.
Try this:
from fractions import Fraction
Fraction('8.0') // Fraction('0.4')
# or equivalently
# Fraction(8, 1) // Fraction(2, 5)
# or
# Fraction('8/1') // Fraction('2/5')
# 20
However:
Fraction('8') // Fraction(0.4)
# 19
Here, 0.4 is interpreted as a float literal (and thus a floating-point binary number) which requires (binary) rounding, and only then converted to the rational number Fraction(3602879701896397, 9007199254740992), which is almost but not exactly 4/10. Then the floored division is executed, and because
19 * Fraction(3602879701896397, 9007199254740992) < 8.0
and
20 * Fraction(3602879701896397, 9007199254740992) > 8.0
the result is 19, not 20.
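You can check those two inequalities directly (the variable name f is just for brevity here):
from fractions import Fraction

f = Fraction(0.4)   # the exact rational value stored by the float literal 0.4
print(f)            # 3602879701896397/9007199254740992
print(19 * f < 8)   # True
print(20 * f > 8)   # True -- hence the floored division of these values is 19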
The same probably happens for 8.0 // 0.4, i.e., it seems the floored division is determined atomically (but on the only-approximate float values of the interpreted float literals).
So why does floor(8.0 / 0.4) give the "right" result? Because there, two rounding errors cancel each other out. First 1), the division is performed, yielding something slightly smaller than 20.0 that is not representable as a float. It gets rounded to the closest float, which happens to be 20.0. Only then is the floor operation performed, but now acting on exactly 20.0, thus not changing the number any more.
1) As Kyle Strand points out, determining the exact result and then rounding it isn't what actually happens at the low level 2) (in CPython's C code or even in CPU instructions). However, it can be a useful model for determining the expected 3) result.
2) On the lowest 4) level, however, this might not be too far off. Some chipsets determine float results by first computing a more precise (but still not exact, it simply has a few more binary digits) internal floating-point result and then rounding to IEEE double precision.
3) "expected" by the Python specification, not necessarily by our intuition.
4) Well, lowest level above logic gates. We don't have to consider the quantum mechanics that make semiconductors possible to understand this.
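As a small demonstration of that first rounding step, math.nextafter (available in Python 3.9+) shows the neighbouring float below 20.0; the exact quotient lies between the two neighbours, but closer to 20.0, so that is what / returns:
>>> import math
>>> math.nextafter(20.0, 0.0)   # the closest float below 20.0
19.999999999999996
>>> 8.0 / 0.4                   # / picks the nearer neighbour, 20.0
20.0
>>> math.floor(8.0 / 0.4)
20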
After checking the semi-official sources of the float object in CPython on GitHub (https://github.com/python/cpython/blob/966b24071af1b320a1c7646d33474eeae057c20f/Objects/floatobject.c), one can understand what happens here.
For normal division, float_div is called (line 560), which internally converts the Python floats to C doubles, does the division, and then converts the resulting double back to a Python float. If you simply do that with 8.0/0.4 in C, you get:
#include "stdio.h"
#include "math.h"
int main(){
double vx = 8.0;
double wx = 0.4;
printf("%lf\n", floor(vx/wx));
printf("%d\n", (int)(floor(vx/wx)));
}
// gives:
// 20.000000
// 20
For the floor division, something else happens. Internally, float_floor_div (line 654) gets called, which then calls float_divmod, a function that is supposed to return a tuple of Python floats containing the floored division as well as the mod/remainder, even though the latter is then simply discarded again (float_floor_div keeps only PyTuple_GET_ITEM(t, 0), the quotient). These values are computed the following way (after conversion to C doubles):
1. First, compute the mod: double mod = fmod(numerator, denominator).
2. Then subtract the mod from the numerator, to get an integral value when you then do the division: floor((numerator - mod) / denominator).
3. Afterwards, snap (numerator - mod) / denominator to the nearest integral value.
The reason why this gives a different result is that fmod(8.0, 0.4), due to floating-point arithmetic, gives 0.4 instead of 0.0. Therefore, the result that is actually computed is floor((8.0 - 0.4) / 0.4) = 19, and snapping (8.0 - 0.4) / 0.4 = 19 to the nearest integral value does not fix the error introduced by the "wrong" result of fmod. You can easily check that in C as well:
#include "stdio.h"
#include "math.h"
int main(){
double vx = 8.0;
double wx = 0.4;
double mod = fmod(vx, wx);
printf("%lf\n", mod);
double div = (vx-mod)/wx;
printf("%lf\n", div);
}
// gives:
// 0.4
// 19.000000
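For illustration, here is a rough Python re-creation of those three steps (floor_div_like is just a made-up name, and it ignores the sign handling that the real float_divmod also performs):
import math

def floor_div_like(vx, wx):
    # step 1: the remainder -- here roughly 0.4 instead of the mathematically exact 0.0
    mod = math.fmod(vx, wx)
    # step 2: divide after removing the remainder
    div = (vx - mod) / wx
    # step 3: snap the quotient to the nearest integral value (as a float)
    floordiv = float(math.floor(div))
    if div - floordiv > 0.5:
        floordiv += 1.0
    return floordiv

print(floor_div_like(8.0, 0.4))   # 19.0
print(8.0 // 0.4)                 # 19.0 -- same result as the real operator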
I would guess that they chose this way of computing the floored division to keep the validity of (numerator // divisor) * divisor + fmod(numerator, divisor) == numerator (as mentioned in the link in @0x539's answer), even though this now results in the somewhat unexpected behavior of floor(8.0/0.4) != 8.0//0.4.
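And indeed, that invariant holds for these values while the floor/floor-division mismatch remains (output on a typical IEEE-754 double platform):
import math

print(8.0 // 0.4 * 0.4 + math.fmod(8.0, 0.4))   # 8.0 -- the invariant is preserved
print(math.floor(8.0 / 0.4))                     # 20
print(8.0 // 0.4)                                # 19.0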