Numpy: What is special about division by 0.5?

-上瘾入骨i 2021-02-05 01:21

This answer by @Dunes states that, due to pipelining, there is (almost) no difference between floating-point multiplication and division. However, from my experience with other l

2 answers
  •  南笙
    南笙 (original poster)
    2021-02-05 02:03

    Intel CPUs have special optimizations when dividing by powers of two. See, for example, http://www.agner.org/optimize/instruction_tables.pdf, where it states

    FDIV latency depends on precision specified in control word: 64 bits precision gives latency 38, 53 bits precision gives latency 32, 24 bits precision gives latency 18. Division by a power of 2 takes 9 clocks.

    Although this applies to FDIV and not DIVPD (as @RalphVersteegen's answer notes), it would be quite surprising if DIVPD did not also implement this optimization.


    Division is normally a very slow operation. However, dividing by a power of two only adjusts the exponent, and the mantissa usually doesn't need to change, which makes the operation very fast. Furthermore, a power of two is easy to detect in floating-point representation because its stored mantissa bits are all zeros (with an implicit leading 1), so this optimization is both easy to test for and cheap to implement.
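
    The exponent-only effect can be checked from Python. This is a minimal sketch, assuming IEEE 754 binary64 floats (which CPython uses on common platforms). The helper names `fields` and `is_power_of_two` are made up for illustration and are not part of NumPy or of any CPU interface. It only shows the bit-level property the argument relies on and says nothing about what a particular divider circuit actually does.

```python
import struct

def fields(x):
    """Split a binary64 float into (sign, biased exponent, stored mantissa bits)."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    mantissa = bits & ((1 << 52) - 1)
    return sign, exponent, mantissa

def is_power_of_two(x):
    """True if |x| is a normal power of two: all stored mantissa bits are zero.

    Hypothetical helper: zeros, subnormals, infinities and NaNs are excluded
    by requiring the biased exponent to lie strictly between 0 and 0x7FF.
    """
    _, exponent, mantissa = fields(x)
    return mantissa == 0 and 0 < exponent < 0x7FF

x = 3.141592653589793
s0, e0, m0 = fields(x)
s1, e1, m1 = fields(x / 0.5)  # divisor is 2**-1: only the exponent changes
s2, e2, m2 = fields(x / 3.0)  # divisor is not a power of two: mantissa changes

print(is_power_of_two(0.5), is_power_of_two(0.3))
print(e1 - e0, m1 == m0, m2 == m0)
```

    The mantissa stays fixed only while the result remains a normal number. Results that overflow or fall into the subnormal range are the cases covered by "usually" above.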
