Rounding to nearest int with numpy.rint() not consistent for .5

余生分开走 2020-12-20 11:37

NumPy's round-to-integer function, np.rint, doesn't seem to be consistent in how it handles values of the form xxx.5:

In [2]: np.rint(1.5)
Out[2]: 2.0

In [3]: np.rint(10.5)
Out[3]: 10.0

7 answers
  • 2020-12-20 12:07

    NumPy does round halves to the nearest even value, but the other rounding modes can be expressed using a combination of operations.

    >>> a=np.arange(-4,5)*0.5
    >>> a
    array([-2. , -1.5, -1. , -0.5,  0. ,  0.5,  1. ,  1.5,  2. ])
    >>> np.floor(a)      # Towards -inf
    array([-2., -2., -1., -1.,  0.,  0.,  1.,  1.,  2.])
    >>> np.ceil(a)       # Towards +inf
    array([-2., -1., -1., -0.,  0.,  1.,  1.,  2.,  2.])
    >>> np.trunc(a)      # Towards 0
    array([-2., -1., -1., -0.,  0.,  0.,  1.,  1.,  2.])
    >>> a+np.copysign(0.5,a)   # Shift away from 0
    array([-2.5, -2. , -1.5, -1. ,  0.5,  1. ,  1.5,  2. ,  2.5])
    >>> np.trunc(a+np.copysign(0.5,a))   # Round half away from 0
    array([-2., -2., -1., -1.,  0.,  1.,  1.,  2.,  2.])
    

    In general, numbers of the form n.5 can be accurately represented by binary floating point (they are m.1 in binary, as 0.5=2**-1), but calculations expected to reach them might not. For instance, negative powers of ten are not exactly represented:

    >>> (0.1).as_integer_ratio()
    (3602879701896397, 36028797018963968)
    >>> [10**n * 10**-n for n in range(20)]
    [1, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
     0.9999999999999999, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
    
  • 2020-12-20 12:08

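    In Python 2, the built-in round function seems to do what you want (it rounds halves away from zero), although it only works on scalars. Note that Python 3's round rounds halves to even, just like NumPy, so the output below applies to Python 2 only: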

    def correct_round(x):
        try:
            y = [ round(z) for z in x ]
        except TypeError:  # x is a scalar, not an iterable
            y = round(x)
        return y
    

    and then to verify:

    print correct_round([-2.5,-1.5,-0.5,0.5,1.5,2.5])
    > [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
    
  • 2020-12-20 12:08

    Not sure it's the most efficient solution, but this works by rounding the magnitudes half up and then restoring the signs:

    signs = np.sign(arr)
    tmp = signs * arr
    arr = np.floor(tmp + 0.5)
    arr = arr * signs
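
    The four lines above can be wrapped into a reusable function. This is a minimal sketch; the name round_half_away_from_zero is made up for illustration and is not part of NumPy.

```python
import numpy as np

def round_half_away_from_zero(arr):
    """Round to the nearest integer, sending exact .5 ties away from zero.

    Hypothetical helper (not a NumPy function): it rounds the magnitudes
    half up and then restores the original signs.
    """
    arr = np.asarray(arr, dtype=float)
    signs = np.sign(arr)              # -1, 0 or +1 per element
    magnitudes = np.abs(arr)
    return signs * np.floor(magnitudes + 0.5)

halves = np.array([-2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 10.5])
rounded = round_half_away_from_zero(halves)
print(rounded)            # ties go to -3, -2, -1, 1, 2, 3, 11
print(np.rint(halves))    # ties go to the nearest even integer instead
```

    Unlike np.rint, this sends 10.5 to 11 and -2.5 to -3.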
    
  • 2020-12-20 12:11

    This is in fact exactly the rounding specified by the IEEE floating point standard, IEEE 754 (1985 and 2008). It is intended to make rounding unbiased. In idealized probability theory, a random real number between two integers has zero probability of being exactly N + 0.5, so it shouldn't matter how you round it because that case never happens. But in real programs, numbers are not random and N + 0.5 occurs quite often. (In fact, you have to round 0.5 every time a floating point number loses 1 bit of precision!) If you always round 0.5 up to the next larger integer, then the average of a bunch of rounded numbers is likely to be slightly larger than the average of the unrounded numbers. This bias, or drift, can have very bad effects on some numerical algorithms and make them inaccurate.
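
    The drift can be seen on a small example. The sketch below uses a deliberately extreme input that consists only of exact ties (0.5, 1.5, ..., 999.5); the variable names are chosen for illustration.

```python
import numpy as np

# Exact ties only: 0.5, 1.5, ..., 999.5 (all exactly representable as floats).
ties = np.arange(1000) + 0.5

half_up = np.floor(ties + 0.5)   # always round .5 up
half_even = np.rint(ties)        # NumPy's behavior: round .5 to even

# Difference between the mean of the rounded values and the true mean.
bias_half_up = half_up.mean() - ties.mean()
bias_half_even = half_even.mean() - ties.mean()
print(bias_half_up, bias_half_even)   # 0.5 0.0
```

    Rounding every tie up shifts the mean by half a unit, while rounding to even alternates between going up and going down, so the errors cancel.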

    The reason rounding to even is better than rounding to odd is that the last binary digit of the result is guaranteed to be zero, so if you have to divide by 2 and round again, you don't lose any information at all.

    In summary, this kind of rounding is the best that mathematicians have been able to devise, and you should WANT it under most circumstances. Now all we need to do is get schools to start teaching it to children.

  • 2020-12-20 12:14

    This kind of behavior (as noted in the comments) is a very traditional form of rounding known as round half to even, also called banker's rounding (according to David Heffernan). The numpy documentation confirms that this type of rounding is used, and also warns that results may be surprising because decimal fractions are not represented exactly in the IEEE floating point format (shown below):

    Notes
    -----
    For values exactly halfway between rounded decimal values, Numpy
    rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0,
    -0.5 and 0.5 round to 0.0, etc. Results may also be surprising due
    to the inexact representation of decimal fractions in the IEEE
    floating point standard [1]_ and errors introduced when scaling
    by powers of ten.
    

    Whether the inexact-representation caveat matters in your case, I honestly don't know. For exact .5 values, though, this is intended behavior: round half to even is the default rounding mode of IEEE 754 (first published in 1985), not an artifact of how numpy is implemented.

    If you just want to round up regardless, the np.ceil function (the ceiling function in general) will do this. If you want the opposite (always rounding down), the np.floor function will achieve this.

  • 2020-12-20 12:16

    An answer to your edit:

    y = int(np.floor(n + 0.5))
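
    A few edge cases are worth checking before relying on this one-liner. The sketch below wraps it in a function whose name, round_half_up, is made up for illustration.

```python
import numpy as np

def round_half_up(n):
    """Round a scalar to the nearest int, sending .5 ties toward +infinity."""
    return int(np.floor(n + 0.5))

print(round_half_up(10.5))   # 11
print(round_half_up(-2.5))   # -2, not -3: ties go up, not away from zero
# The largest float below 0.5: the sum n + 0.5 itself rounds to 1.0.
print(round_half_up(0.49999999999999994))   # 1
```

    So negative ties go toward zero, and inputs just below a tie can be pushed over it by the addition itself.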
    