Cast performance from size_t to double

半阙折子戏

TL;DR: Why is multiplying/casting data in size_t slow and why does this vary per platform?

I'm having some performance issues that I don'

2 Answers

伪装坚强ぢ

2021-02-05 03:46
For your original questions:
1. The code is slow because it converts from an integer type to a floating-point type on every iteration. That is why it speeds up so easily when you also use an integer type for the sum variables: the float conversion is no longer needed.
2. The difference between platforms is the result of several factors. For one, it depends on how efficiently a given platform can perform an int->float conversion. The conversion can also interfere with processor-internal optimizations (program flow and branch prediction, caches, ...), and the processor's internal parallelization features can have a huge influence on calculations like these.
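The question's code is cut off above, so the following is only a minimal sketch of the pattern described in point 1, not the asker's actual loop. The function names `sum_squares_double` and `sum_squares_integer` are hypothetical. The first accumulates into a double and therefore converts an unsigned integer to floating point on every iteration; the second stays in integer arithmetic throughout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: accumulate i*i into a double.
   Each iteration converts an unsigned 64-bit integer to double. */
double sum_squares_double(size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        sum += (double)((uint64_t)i * i);
    }
    return sum;
}

/* Hypothetical example: same sum with an integer accumulator.
   No integer-to-float conversion happens inside the loop. */
uint64_t sum_squares_integer(size_t n) {
    uint64_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        sum += (uint64_t)i * i;
    }
    return sum;
}
```

Timing the two with the same `n` on your own compiler and flags is the only reliable way to see how large the gap is on a given platform; an optimizing compiler may also transform either loop.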
For the additional questions:
- "Surprisingly int is faster than uint_fast32_t"? What are sizeof(size_t) and sizeof(int) on your platform? My guess is that both are probably 64 bits wide, so a cast to 32 bits can not only produce calculation errors but also carries a penalty for converting between different sizes.
In general, avoid both visible and hidden casts wherever they aren't really necessary. For example, find out which real data type is hidden behind "size_t" in your environment (gcc) and use that one for the loop variable. In your example, the square of an unsigned integer is always an integer, so it makes no sense to use double here. Stick to integer types to achieve maximum performance.
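As a rough way to answer the sizeof question on a given platform, something like the sketch below could be used. The helper names `print_type_sizes` and `fits_in_int` are made up for illustration; the printed sizes depend entirely on the compiler and target.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: print the widths of the types discussed above. */
void print_type_sizes(void) {
    printf("sizeof(int)           = %zu\n", sizeof(int));
    printf("sizeof(size_t)        = %zu\n", sizeof(size_t));
    printf("sizeof(uint_fast32_t) = %zu\n", sizeof(uint_fast32_t));
    printf("sizeof(uint64_t)      = %zu\n", sizeof(uint64_t));
}

/* Hypothetical helper: returns 1 if casting v to int would not lose data. */
int fits_in_int(size_t v) {
    return v <= (size_t)INT_MAX;
}
```

Calling `print_type_sizes()` once at startup shows whether a loop variable declared as `int` or `uint_fast32_t` actually has a different width from `size_t`, which is where a hidden size-changing cast would come from.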
天涯浪人

2021-02-05 03:56

On x86, the conversion of uint64_t to floating point is slower because there are only instructions to convert int64_t, int32_t and int16_t. int16_t and, in 32-bit mode, int64_t can only be converted using x87 instructions, not SSE.

When converting uint64_t to floating point, GCC 4.2.1 first converts the value as if it were an int64_t and then adds 2⁶⁴ if it was negative to compensate. (When using the x87 on Windows and *BSD, or if you changed the precision control, beware that the conversion ignores precision control but the addition respects it.)
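The convert-then-compensate idea can be written out in C roughly as follows. This is an illustrative sketch, not GCC's generated code: the function name `u64_to_double_via_signed` is hypothetical, it assumes a two's complement target where casting a large uint64_t to int64_t wraps, and because it rounds twice it may differ from a native conversion in the last bit for some inputs.

```c
#include <stdint.h>

/* Hypothetical sketch: convert uint64_t to double using only a signed
   conversion, then add 2^64 when the signed view was negative. */
double u64_to_double_via_signed(uint64_t u) {
    /* Assumption: two's complement wraparound, as on x86 with GCC. */
    int64_t s = (int64_t)u;
    double d = (double)s;              /* signed 64-bit -> double */
    if (s < 0) {
        d += 18446744073709551616.0;   /* 2^64, exactly representable */
    }
    return d;
}
```

The extra compare and add are the overhead that a plain int64_t conversion does not pay.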

A uint32_t is first extended to int64_t.
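In C terms, that widening step looks something like the sketch below (the function name `u32_to_double_via_int64` is hypothetical). Since every uint32_t value fits in a non-negative int64_t, no compensation is needed afterwards and the result is exact.

```c
#include <stdint.h>

/* Hypothetical sketch: zero-extend a uint32_t to int64_t, then use the
   signed 64-bit conversion. The widened value is never negative. */
double u32_to_double_via_int64(uint32_t u) {
    int64_t wide = (int64_t)u;   /* zero-extension, value preserved */
    return (double)wide;         /* signed 64-bit -> double, exact */
}
```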

When converting 64-bit integers in 32-bit mode on processors with certain 64-bit capabilities, a store-to-load forwarding issue may cause stalls. The 64-bit integer is written as two 32-bit values and read back as one 64-bit value. This can be very bad if the conversion is part of a long dependency chain (which it is not in this case).