This answer by @Dunes states that, due to pipelining, there is (almost) no difference between floating-point multiplication and division. However, from my experience with other l
Intel CPUs have special optimizations when dividing by powers of two. See, for example, http://www.agner.org/optimize/instruction_tables.pdf, where it states:

"FDIV latency depends on precision specified in control word: 64 bits precision gives latency 38, 53 bits precision gives latency 32, 24 bits precision gives latency 18. Division by a power of 2 takes 9 clocks."
Although this applies to FDIV and not DIVPD (as @RalphVersteegen's answer notes), it would be quite surprising if DIVPD did not also implement this optimization.
Division is normally a very slow affair. However, a division by a power of two is just an exponent shift, and the mantissa usually doesn't need to change. This makes the operation very fast. Furthermore, it's easy to detect a power of two in floating-point representation as the mantissa will be all zeros (with an implicit leading 1), so this optimization is both easy to test for and cheap to implement.
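To make the exponent-shift point concrete, here is a small sketch (the `fields` helper is just for illustration) that decodes the IEEE-754 bit fields of a double and shows that dividing by 8 = 2^3 leaves the mantissa bits untouched and only lowers the biased exponent by 3, while the power of two itself has an all-zero mantissa field:

```python
import struct

def fields(x: float):
    """Decompose an IEEE-754 double into (sign, biased exponent, mantissa bits)."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)

s1, e1, m1 = fields(13.0)        # 13.0  = 1.101b * 2^3
s2, e2, m2 = fields(13.0 / 8.0)  # 1.625 = 1.101b * 2^0

print(m1 == m2)   # mantissa unchanged by the division
print(e1 - e2)    # exponent dropped by exactly 3, since 8 = 2**3

# A power of two is easy to detect: its mantissa field is all zeros
# (the leading 1 of the significand is implicit).
print(fields(8.0)[2] == 0)
```

This is only a bit-level illustration of why the hardware shortcut is cheap; the CPU's divider performs the equivalent exponent adjustment internally rather than via integer bit manipulation.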