Why is the `float` function slower than multiplying by 1.0?

孤者浪人 submitted on 2019-12-30 03:43:10

Question


I understand that this could be argued as a non-issue, but I write software for HPC environments, so this 3.5x speed increase actually makes a difference.

In [1]: %timeit 10 / float(98765)            
1000000 loops, best of 3: 313 ns per loop

In [2]: %timeit 10 / (98765 * 1.0)
10000000 loops, best of 3: 80.6 ns per loop
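Outside IPython, roughly the same measurement can be sketched with the standard timeit module. The loop count N is an arbitrary choice made for this sketch, not something from the original measurement, and the absolute numbers will vary by machine and Python version.

```python
import timeit

# Number of executions per measurement; an arbitrary choice for this sketch.
N = 100000

# Take the best of 3 runs, as %timeit reports, and convert to time per loop.
t_call = min(timeit.repeat("10 / float(98765)", repeat=3, number=N)) / N
t_mul = min(timeit.repeat("10 / (98765 * 1.0)", repeat=3, number=N)) / N

print("float():  %.1f ns per loop" % (t_call * 1e9))
print("* 1.0:    %.1f ns per loop" % (t_mul * 1e9))
```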

I used dis to have a look at the code, and I assume float() will be slower as it requires a function call (unfortunately I couldn't dis.dis(float) to see what it's actually doing).

I guess a second question would be: when should I use float(n), and when should I use n * 1.0?


Answer 1:


Because the peephole optimizer precalculates the result of that multiplication at compile time:

import dis
dis.dis(compile("10 / float(98765)", "<string>", "eval"))

  1           0 LOAD_CONST               0 (10)
              3 LOAD_NAME                0 (float)
              6 LOAD_CONST               1 (98765)
              9 CALL_FUNCTION            1
             12 BINARY_DIVIDE       
             13 RETURN_VALUE        

dis.dis(compile("10 / (98765 * 1.0)", "<string>", "eval"))

  1           0 LOAD_CONST               0 (10)
              3 LOAD_CONST               3 (98765.0)
              6 BINARY_DIVIDE       
              7 RETURN_VALUE        

It stores the result of 98765 * 1.0 in the bytecode as a constant value, so it only has to load that constant and divide, whereas in the first case the function has to be called.

We can see that even more clearly like this:

print compile("10 / (98765 * 1.0)", "<string>", "eval").co_consts
# (10, 98765, 1.0, 98765.0)

Since the value is precalculated at compile time, the second version is faster.
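The folding can also be checked programmatically. The following is a minimal sketch; has_float_const is a hypothetical helper name introduced here for illustration, and it inspects only the multiplication sub-expression so that it behaves the same on CPython 2.7 and on CPython 3, where the division may be folded as well.

```python
def has_float_const(source, value):
    """Return True if compiling `source` leaves the float `value` in co_consts.

    Hypothetical helper for illustration only.
    """
    code = compile(source, "<string>", "eval")
    # Compare the type as well as the value, because 98765 == 98765.0 is True.
    return any(type(c) is float and c == value for c in code.co_consts)

# A literal multiplication is folded, so 98765.0 is stored as a constant.
print(has_float_const("98765 * 1.0", 98765.0))   # True on CPython
# A call to float() cannot be folded; only the int 98765 is stored.
print(has_float_const("float(98765)", 98765.0))  # False
# With a variable there is nothing to fold at compile time.
print(has_float_const("n * 1.0", 98765.0))       # False
```

The last case suggests that the speed advantage applies only when both operands are literals. With a variable n, the multiplication still has to run at execution time.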

Edit: As pointed out by Davidmh in the comments,

The reason it does not also optimise away the division is that the division's behaviour depends on flags, such as from __future__ import division and the -Q command-line flag.

Quoting the comment from the actual peephole optimizer code for Python 2.7.9,

        /* Cannot fold this operation statically since
           the result can depend on the run-time presence
           of the -Qnew flag */
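To illustrate why the meaning of / is not fixed in Python 2, here is a small sketch that compiles the same expression with and without the division future flag. It uses the compile-time flag rather than the run-time -Qnew option mentioned in the comment, so it is only an analogy for the same ambiguity. The expression 10 / 4 is chosen for illustration.

```python
import __future__

expr = "10 / 4"

# Compiled with the default division rules of the running interpreter.
default_code = compile(expr, "<string>", "eval")

# Compiled as if `from __future__ import division` were in effect.
true_div_code = compile(expr, "<string>", "eval",
                        __future__.division.compiler_flag)

# On Python 2 without -Qnew this prints 2; on Python 3 it prints 2.5.
print(eval(default_code))
# True division in both cases: prints 2.5.
print(eval(true_div_code))
```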


Source: https://stackoverflow.com/questions/22983625/why-float-function-is-slower-than-multiplying-by-1-0
