This answer by @Dunes states that, due to pipelining, there is (almost) no difference between floating-point multiplication and division. However, from my experience with other l
Intel CPUs have special optimizations when dividing by powers of two. See, for example, http://www.agner.org/optimize/instruction_tables.pdf, where it states:

"FDIV latency depends on precision specified in control word: 64 bits precision gives latency 38, 53 bits precision gives latency 32, 24 bits precision gives latency 18. Division by a power of 2 takes 9 clocks."
Although this applies to FDIV and not DIVPD (as @RalphVersteegen's answer notes), it would be quite surprising if DIVPD did not also implement this optimization.
Division is normally a very slow affair. However, a division by a power of two is just an exponent shift, and the mantissa usually doesn't need to change. This makes the operation very fast. Furthermore, it's easy to detect a power of two in floating-point representation as the mantissa will be all zeros (with an implicit leading 1), so this optimization is both easy to test for and cheap to implement.
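To make the exponent-shift point concrete, here is a small sketch (the `fields` helper is just for illustration) that decodes the IEEE-754 bit fields of a double and shows that dividing by 8 = 2^3 leaves the mantissa bits untouched and only lowers the biased exponent by 3, while the power of two itself has an all-zero mantissa field:

```python
import struct

def fields(x: float):
    """Decompose an IEEE-754 double into (sign, biased exponent, mantissa bits)."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)

s1, e1, m1 = fields(13.0)        # 13.0  = 1.101b * 2^3
s2, e2, m2 = fields(13.0 / 8.0)  # 1.625 = 1.101b * 2^0

print(m1 == m2)   # mantissa unchanged by the division
print(e1 - e2)    # exponent dropped by exactly 3, since 8 = 2**3

# A power of two is easy to detect: its mantissa field is all zeros
# (the leading 1 of the significand is implicit).
print(fields(8.0)[2] == 0)
```

This is only a bit-level illustration of why the hardware shortcut is cheap; the CPU's divider performs the equivalent exponent adjustment internally rather than via integer bit manipulation.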