You can find a complete answer in this article: *What Every Computer Scientist Should Know About Floating-Point Arithmetic*
This is a quote from a previous Stack Overflow thread, about how `float` and `double` variables affect memory bandwidth:
> If a double requires more storage than a float, then it will take longer to read the data. That's the naive answer. On a modern IA32, it all depends on where the data is coming from. If it's in L1 cache, the load cost is negligible, provided the data comes from a single cache line; if it spans more than one cache line there's a small overhead. If it's from L2 it takes a while longer, if it's in RAM it's longer still, and finally, if it's on disk it's a huge time.
>
> So the choice of float or double is less important than the way the data is used. If you want to do a small calculation on lots of sequential data, a small data type is preferable. Doing a lot of computation on a small data set would allow you to use bigger data types without any significant effect. If you're accessing the data very randomly, then the choice of data size is unimportant: data is loaded in pages/cache lines, so even if you only want a byte from RAM, you could get 32 bytes transferred (this is very dependent on the architecture of the system).
>
> On top of all of this, the CPU/FPU could be superscalar and pipelined, so even though a load may take several cycles, the CPU/FPU could be busy doing something else (a multiply, for instance) that hides the load time to a degree.