Cast performance from size_t to double

半阙折子戏 2021-02-05 03:29

TL;DR: Why is multiplying/casting data in size_t slow and why does this vary per platform?

I'm having some performance issues that I don't ...

2 answers
  •  伪装坚强ぢ
    2021-02-05 03:46

    For your original questions:

    1. The code is slow because it converts from an integer to a floating-point data type. That is why it speeds up so easily when the sum variables are also integers: no conversion to floating point is needed anymore.
    2. The difference results from several factors. For one, it depends on how efficiently a platform can perform an int->float conversion. The conversion can also interfere with processor-internal optimizations such as program-flow prediction and caching, and the processor's internal parallelization features can have a large influence on calculations like these.
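The two points above can be checked with a small experiment. What follows is only a minimal sketch, not the asker's code (which is not shown in full here); the function names `sum_squares_double` and `sum_squares_integer` are hypothetical. The first variant converts every `size_t` index to `double` before squaring it, the second stays in integer arithmetic throughout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical variant 1: the accumulator is a double, so every
   iteration performs an integer-to-floating-point conversion. */
double sum_squares_double(size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double x = (double)i;   /* int -> float conversion */
        sum += x * x;
    }
    return sum;
}

/* Hypothetical variant 2: the accumulator is an integer type, so the
   loop body needs no conversion at all. */
uint64_t sum_squares_integer(size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        sum += (uint64_t)i * (uint64_t)i;
    }
    return sum;
}
```

Timing both functions for the same `n` on each target platform (for example with `clock()` from `<time.h>`) should show how much the conversion costs there. The numbers will depend on the compiler and its optimization flags. Note that the two variants only agree while the sum stays small enough: the `uint64_t` version wraps around on overflow, and the `double` version loses exactness above 2^53.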

    For the additional questions:

    • "Surprisingly int is faster than uint_fast32_t"? What are sizeof(size_t) and sizeof(int) on your platform? One guess is that both are 64-bit, in which case a cast to 32 bits can not only introduce calculation errors but also carries a penalty for converting between different sizes.
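The sizes asked about above can be printed directly. This is a minimal sketch; the helper name `print_type_sizes` is hypothetical, and the output depends entirely on the platform and compiler.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Prints the width in bytes of each type discussed in this thread. */
void print_type_sizes(void)
{
    printf("sizeof(int)           = %zu\n", sizeof(int));
    printf("sizeof(size_t)        = %zu\n", sizeof(size_t));
    printf("sizeof(uint_fast32_t) = %zu\n", sizeof(uint_fast32_t));
    printf("sizeof(double)        = %zu\n", sizeof(double));
}
```

On x86-64 Linux with glibc, for instance, `int` is typically 4 bytes while `size_t` and `uint_fast32_t` are typically 8 bytes, so the "fast" 32-bit type is not necessarily the same width as `int`.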

    In general, avoid explicit and implicit casts wherever they are not really necessary. For example, find out which actual data type is behind size_t in your environment (gcc) and use that type for the loop variable. In your example, the square of an unsigned integer is always an integer, so there is no reason to use double here. Stick to integer types for maximum performance.
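One way to find out which type is behind `size_t` is to let the compiler report it. The sketch below assumes a C11 compiler and uses `_Generic`; the macro `UNSIGNED_TYPE_NAME` and the function `size_t_type_name` are hypothetical names introduced only for illustration.

```c
#include <stddef.h>

/* Maps an expression to the name of its unsigned integer type.
   Requires C11 (_Generic). Types not listed fall through to "other". */
#define UNSIGNED_TYPE_NAME(x) _Generic((x),        \
    unsigned int:       "unsigned int",            \
    unsigned long:      "unsigned long",           \
    unsigned long long: "unsigned long long",      \
    default:            "other")

/* Returns the name of the type that size_t aliases on this platform. */
const char *size_t_type_name(void)
{
    return UNSIGNED_TYPE_NAME((size_t)0);
}
```

With gcc, the same information is also available from the predefined macro `__SIZE_TYPE__`, for example via `gcc -dM -E - < /dev/null | grep __SIZE_TYPE__`.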
