Why are some float < integer comparisons four times slower than others?

庸人自扰 2021-01-29 18:16

When comparing floats to integers, some pairs of values take much longer to be evaluated than other values of a similar magnitude.

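For example, comparing the float 562949953420000.7 with the integer 562949953421000 takes noticeably longer than comparing it with the integer 562949953422000.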
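The difference can be measured with the standard timeit module. This is a minimal sketch: the values are the ones discussed in the answers on this page, the helper name time_comparison is made up for illustration, and the absolute timings will vary by machine and CPython version.

```python
import timeit

F = 562949953420000.7
SLOW_INT = 562949953421000   # 49 bits, equal to the float's exponent
FAST_INT = 562949953422000   # 50 bits


def time_comparison(f, n, number=200_000):
    """Return the seconds taken to evaluate `f < n` `number` times."""
    # Passing the operands as globals keeps the comparison a runtime operation.
    return timeit.timeit("f < n", globals={"f": f, "n": n}, number=number)


if __name__ == "__main__":
    print("slow pair:", time_comparison(F, SLOW_INT))
    print("fast pair:", time_comparison(F, FAST_INT))
```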


        
2 Answers
  • 2021-01-29 18:39

    A comment in the Python source code for float objects acknowledges that:

    Comparison is pretty much a nightmare

    This is especially true when comparing a float to an integer, because, unlike floats, integers in Python can be arbitrarily large and are always exact. Trying to cast the integer to a float might lose precision and make the comparison inaccurate. Trying to cast the float to an integer is not going to work either because any fractional part will be lost.
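Both pitfalls are easy to see from the interpreter. The values below are chosen only for illustration: 2**53 + 1 is the smallest positive integer that a C double cannot represent exactly.

```python
n = 2**53 + 1            # needs 54 bits, one more than a double's 53-bit significand

# Converting the integer to a float rounds it to a representable neighbour.
print(float(n) == float(2**53))    # True: the conversion lost the final 1

# Converting the float to an integer discards the fraction.
print(int(4.65))                   # 4

# Python's mixed comparison is exact, so neither shortcut can be what it does.
print(n > float(2**53))            # True
print(4.65 > 4)                    # True
```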

    To get around this problem, Python performs a series of checks, returning the result if one of the checks succeeds. It compares the signs of the two values, then whether the integer is "too big" to be a float, then compares the exponent of the float to the length of the integer. If all of these checks fail, it is necessary to construct two new Python objects to compare in order to obtain the result.

    When comparing a float v to an integer/long w, the worst case is that:

    • v and w have the same sign (both positive or both negative),
    • the integer w has few enough bits that its bit count can be held in the size_t type (typically 32 or 64 bits),
    • the integer w has at least 49 bits,
    • the exponent of the float v is the same as the number of bits in w.

    And this is exactly what we have for the values in the question:

    >>> import math
    >>> math.frexp(562949953420000.7) # gives the float's (significand, exponent) pair
    (0.9999999999976706, 49)
    >>> (562949953421000).bit_length()
    49
    

    We see that 49 is both the exponent of the float and the number of bits in the integer. Both numbers are positive and so the four criteria above are met.

    Choosing one of the values to be larger (or smaller) can change the number of bits of the integer, or the value of the exponent, and so Python is able to determine the result of the comparison without performing the expensive final check.
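The four criteria can be written as a small Python predicate. This is only a sketch of the conditions listed above, not CPython's code: the name is_worst_case is hypothetical, and the size_t criterion is assumed to hold because the bit count of any integer that fits in memory does.

```python
import math


def is_worst_case(v: float, w: int) -> bool:
    """Return True if `v` and `w` meet the four worst-case criteria."""
    if math.isnan(v) or math.isinf(v):
        return False                      # non-finite floats are handled earlier
    v_sign = (v > 0) - (v < 0)
    w_sign = (w > 0) - (w < 0)
    if v_sign != w_sign:
        return False                      # the signs alone decide the result
    nbits = w.bit_length()                # bit count of abs(w)
    if nbits < 49:
        return False                      # small enough to convert to a double
    _, exponent = math.frexp(v)
    return exponent == nbits


print(is_worst_case(562949953420000.7, 562949953421000))  # True
print(is_worst_case(562949953420000.7, 562949953422000))  # False
```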

    This is specific to the CPython implementation of the language.


    The comparison in more detail

    The float_richcompare function handles the comparison between two values v and w.

    Below is a step-by-step description of the checks that the function performs. The comments in the Python source are actually very helpful when trying to understand what the function does, so I've left them in where relevant. I've also summarised these checks in a list at the foot of the answer.

    The main idea is to map the Python objects v and w to two appropriate C doubles, i and j, which can then be easily compared to give the correct result. Both Python 2 and Python 3 use the same ideas to do this (the former just handles int and long types separately).

    The first thing to do is check that v is definitely a Python float and map it to a C double i. Next the function looks at whether w is also a float and maps it to a C double j. This is the best case scenario for the function as all the other checks can be skipped. The function also checks to see whether v is inf or nan:

    static PyObject*
    float_richcompare(PyObject *v, PyObject *w, int op)
    {
        double i, j;
        int r = 0;
        assert(PyFloat_Check(v));       
        i = PyFloat_AS_DOUBLE(v);       
    
        if (PyFloat_Check(w))           
            j = PyFloat_AS_DOUBLE(w);   
    
        else if (!Py_IS_FINITE(i)) {
            if (PyLong_Check(w))
                j = 0.0;
            else
                goto Unimplemented;
        }
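When v is infinite or nan, the value of the integer no longer matters, which is why the C code can replace it with 0.0. The observable behaviour can be checked from Python (the integer 10**400 is an arbitrary choice that is far too large for a float):

```python
import math

big = 10**400              # an integer far beyond the largest finite float

print(math.inf > big)      # True
print(-math.inf < -big)    # True
print(math.nan < big)      # False
print(math.nan == big)     # False
print(math.nan != big)     # True
```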
    

    Now we know that if w failed these checks, it is not a Python float. Now the function checks if it's a Python integer. If this is the case, the easiest test is to extract the sign of v and the sign of w (return 0 if zero, -1 if negative, 1 if positive). If the signs are different, this is all the information needed to return the result of the comparison:

        else if (PyLong_Check(w)) {
            int vsign = i == 0.0 ? 0 : i < 0.0 ? -1 : 1;
            int wsign = _PyLong_Sign(w);
            size_t nbits;
            int exponent;
    
            if (vsign != wsign) {
                /* Magnitudes are irrelevant -- the signs alone
                 * determine the outcome.
                 */
                i = (double)vsign;
                j = (double)wsign;
                goto Compare;
            }
        }   
    

    If this check failed, then v and w have the same sign.

    The next check counts the number of bits in the integer w. If it has too many bits then it can't possibly be held as a float and so must be larger in magnitude than the float v:

        nbits = _PyLong_NumBits(w);
        if (nbits == (size_t)-1 && PyErr_Occurred()) {
            /* This long is so large that size_t isn't big enough
             * to hold the # of bits.  Replace with little doubles
             * that give the same outcome -- w is so large that
             * its magnitude must exceed the magnitude of any
             * finite float.
             */
            PyErr_Clear();
            i = (double)vsign;
            assert(wsign != 0);
            j = wsign * 2.0;
            goto Compare;
        }
    

    On the other hand, if the integer w has 48 or fewer bits, it can safely be turned into a C double j and compared:

        if (nbits <= 48) {
            j = PyLong_AsDouble(w);
            /* It's impossible that <= 48 bits overflowed. */
            assert(j != -1.0 || ! PyErr_Occurred());
            goto Compare;
        }
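A double has a 53-bit significand, so the threshold of 48 bits is comfortably inside the range where the conversion is exact. A quick round-trip check in Python (the 61-bit counterexample is picked only for illustration):

```python
largest_48_bit = 2**48 - 1
print(largest_48_bit.bit_length())                     # 48
print(int(float(largest_48_bit)) == largest_48_bit)    # True: the round trip is exact

# A wider integer may not survive the same round trip.
n = 2**60 + 1
print(n.bit_length())                                  # 61
print(int(float(n)) == n)                              # False
```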
    

    From this point onwards, we know that w has 49 or more bits. It will be convenient to treat w as a positive integer, so change the sign and the comparison operator as necessary:

        if (vsign < 0) {
            /* "Multiply both sides" by -1; this also swaps the
             * comparator.
             */
            i = -i;
            op = _Py_SwappedOp[op];
        }
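Negating both sides of a comparison reverses its direction, which is what the lookup through _Py_SwappedOp takes care of. A rough Python equivalent is shown below. The names SWAPPED_OP and negate_both_sides are hypothetical stand-ins, and unlike this sketch the C code only negates the double, because from here on it only needs the bit count of w.

```python
import operator

# Hypothetical Python stand-in for the C lookup table _Py_SwappedOp.
SWAPPED_OP = {
    operator.lt: operator.gt,
    operator.le: operator.ge,
    operator.eq: operator.eq,
    operator.ne: operator.ne,
    operator.gt: operator.lt,
    operator.ge: operator.le,
}


def negate_both_sides(v, w, op):
    """Return an equivalent comparison in which both operands are negated."""
    return -v, -w, SWAPPED_OP[op]


v, w, op = negate_both_sides(-3.5, -3, operator.lt)   # -3.5 < -3 becomes 3.5 > 3
print(v, w, op(v, w))                                  # 3.5 3 True
```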
    

    Now the function looks at the exponent of the float. Recall that a float can be written (ignoring sign) as significand * 2^exponent, where the significand is at least 0.5 and less than 1:

        (void) frexp(i, &exponent);
        if (exponent < 0 || (size_t)exponent < nbits) {
            i = 1.0;
            j = 2.0;
            goto Compare;
        }
    

    This checks two things. If the exponent is less than 0 then the float is smaller than 1 (and so smaller in magnitude than any non-zero integer). Or, if the exponent is less than the number of bits in w then we have that v < |w|, since significand * 2^exponent is less than 2^(nbits-1), the smallest magnitude an integer with nbits bits can have.

    Failing these two checks, the function looks to see whether the exponent is greater than the number of bits in w. This shows that significand * 2^exponent is at least 2^nbits, which is larger than any integer with nbits bits, and so v > |w|:

        if ((size_t)exponent > nbits) {
            i = 2.0;
            j = 1.0;
            goto Compare;
        }
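The two exponent checks can be modelled in Python with math.frexp and int.bit_length. This is a sketch that assumes a positive finite float and a positive integer, as is the case at this point in the C function. The name compare_by_exponent and its -1 / 1 / None return convention are invented for the example.

```python
import math


def compare_by_exponent(v: float, w: int):
    """Compare positive v and w using only v's exponent and w's bit count.

    Returns -1 if v < w, 1 if v > w, or None if this test cannot decide.
    """
    nbits = w.bit_length()
    _, exponent = math.frexp(v)   # v == significand * 2**exponent, 0.5 <= significand < 1
    if exponent < 0 or exponent < nbits:
        return -1                 # v < 2**exponent <= 2**(nbits - 1) <= w
    if exponent > nbits:
        return 1                  # v >= 2**(exponent - 1) >= 2**nbits > w
    return None                   # exponent == nbits: an exact comparison is needed


print(compare_by_exponent(562949953420000.7, 562949953422000))  # -1
print(compare_by_exponent(2.0**60, 562949953421000))            # 1
print(compare_by_exponent(562949953420000.7, 562949953421000))  # None
```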
    

    If this check did not succeed we know that the exponent of the float v is the same as the number of bits in the integer w.

    The only way that the two values can be compared now is to construct two new Python integers from v and w. The idea is to discard the fractional part of v, double the integer part, and then add one. w is also doubled and these two new Python objects can be compared to give the correct return value. Using an example with small values, 4.65 < 4 would be determined by the comparison (2*4)+1 == 9 < 8 == (2*4) (returning false).

        {
            double fracpart;
            double intpart;
            PyObject *result = NULL;
            PyObject *one = NULL;
            PyObject *vv = NULL;
            PyObject *ww = w;
    
            // snip
    
            fracpart = modf(i, &intpart); // split i (the double that v mapped to)
            vv = PyLong_FromDouble(intpart);
    
            // snip
    
            if (fracpart != 0.0) {
                /* Shift left, and or a 1 bit into vv
                 * to represent the lost fraction.
                 */
                PyObject *temp;
    
                one = PyLong_FromLong(1);
    
                temp = PyNumber_Lshift(ww, one); // left-shift doubles an integer
                ww = temp;
    
                temp = PyNumber_Lshift(vv, one);
                vv = temp;
    
                temp = PyNumber_Or(vv, one); // a doubled integer is even, so this adds 1
                vv = temp;
            }
            // snip
        }
    }
    

    For brevity I've left out the additional error-checking and garbage-tracking Python has to do when it creates these new objects. Needless to say, this adds additional overhead and explains why the values highlighted in the question are significantly slower to compare than others.
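The same idea is compact in Python, where the integers are arbitrary precision already. This is an illustrative sketch only: the name exact_less_than is hypothetical, only the `<` operator is modelled, and v and w are assumed to be positive, as they are when the C function reaches this step.

```python
import math


def exact_less_than(v: float, w: int) -> bool:
    """Evaluate `v < w` exactly for positive finite v and positive w."""
    fracpart, intpart = math.modf(v)   # split v into fractional and integer parts
    vv = int(intpart)                  # exact, because intpart has no fraction
    ww = w
    if fracpart != 0.0:
        # Double both sides, then set the lowest bit of vv to stand in
        # for the fraction that was dropped.
        ww = ww << 1
        vv = (vv << 1) | 1
    return vv < ww


print(exact_less_than(4.65, 4))                                # False: 9 < 8
print(exact_less_than(4.65, 5))                                # True: 9 < 10
print(exact_less_than(562949953420000.7, 562949953421000))     # True
```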


    Here is a summary of the checks that are performed by the comparison function.

    Let v be a float and cast it as a C double. Now, if w is also a float:

    • Cast w to a C double as well and compare the two doubles directly.

    If w is not a float and v is nan or inf:

    • Handle this special case separately depending on the type of w. An integer w is replaced by 0.0, since its value cannot affect the outcome; any other type is reported as unimplemented.

    If w is an integer:

    • Extract the signs of v and w. If they are different then we know v and w are different and which is the greater value.

    • (The signs are the same.) Check whether w has so many bits that the bit count does not fit in a size_t. If so, w is too large to be a float and has greater magnitude than v.

    • Check if w has 48 or fewer bits. If so, it can be safely cast to a C double without losing its precision and compared with v.

    • (w has more than 48 bits. We will now treat w as a positive integer having changed the compare op as appropriate.)

    • Consider the exponent of the float v. If the exponent is negative, then v is less than 1 and therefore less than any positive integer. Else, if the exponent is less than the number of bits in w then it must be less than w.

    • If the exponent of v is greater than the number of bits in w then v is greater than w.

    • (The exponent is the same as the number of bits in w.)

    • The final check. Split v into its integer and fractional parts. If the fractional part is non-zero, double the integer part and add 1 to compensate for the fractional part, and double the integer w as well. Compare these two new integers instead to get the result.

  • 2021-01-29 18:42

    Using gmpy2 with arbitrary precision floats and integers it is possible to get more uniform comparison performance:

    ~ $ ptipython
    Python 3.5.1 |Anaconda 4.0.0 (64-bit)| (default, Dec  7 2015, 11:16:01) 
    Type "copyright", "credits" or "license" for more information.
    
    IPython 4.1.2 -- An enhanced Interactive Python.
    ?         -> Introduction and overview of IPython's features.
    %quickref -> Quick reference.
    help      -> Python's own help system.
    object?   -> Details about 'object', use 'object??' for extra details.
    
    In [1]: import gmpy2
    
    In [2]: from gmpy2 import mpfr
    
    In [3]: from gmpy2 import mpz
    
    In [4]: gmpy2.get_context().precision=200
    
    In [5]: i1=562949953421000
    
    In [6]: i2=562949953422000
    
    In [7]: f=562949953420000.7
    
    In [8]: i11=mpz('562949953421000')
    
    In [9]: i12=mpz('562949953422000')
    
    In [10]: f1=mpfr('562949953420000.7')
    
    In [11]: f<i1
    Out[11]: True
    
    In [12]: f<i2
    Out[12]: True
    
    In [13]: f1<i11
    Out[13]: True
    
    In [14]: f1<i12
    Out[14]: True
    
    In [15]: %timeit f<i1
    The slowest run took 10.15 times longer than the fastest. This could mean that an intermediate result is being cached.
    1000000 loops, best of 3: 441 ns per loop
    
    In [16]: %timeit f<i2
    The slowest run took 12.55 times longer than the fastest. This could mean that an intermediate result is being cached.
    10000000 loops, best of 3: 152 ns per loop
    
    In [17]: %timeit f1<i11
    The slowest run took 32.04 times longer than the fastest. This could mean that an intermediate result is being cached.
    1000000 loops, best of 3: 269 ns per loop
    
    In [18]: %timeit f1<i12
    The slowest run took 36.81 times longer than the fastest. This could mean that an intermediate result is being cached.
    1000000 loops, best of 3: 231 ns per loop
    
    In [19]: %timeit f<i11
    The slowest run took 78.26 times longer than the fastest. This could mean that an intermediate result is being cached.
    10000000 loops, best of 3: 156 ns per loop
    
    In [20]: %timeit f<i12
    The slowest run took 21.24 times longer than the fastest. This could mean that an intermediate result is being cached.
    10000000 loops, best of 3: 194 ns per loop
    
    In [21]: %timeit f1<i1
    The slowest run took 37.61 times longer than the fastest. This could mean that an intermediate result is being cached.
    1000000 loops, best of 3: 275 ns per loop
    
    In [22]: %timeit f1<i2
    The slowest run took 39.03 times longer than the fastest. This could mean that an intermediate result is being cached.
    1000000 loops, best of 3: 259 ns per loop
    