How to speed up floating-point to integer number conversion? [duplicate]


情话喂你

16 answers

不要未来只要你来

2020-12-04 18:42

You might be able to load all of the floats into the SSE registers of your processor using some magic assembly code, do the equivalent conversion to ints there, then read the results back as ints. I'm not sure this would be any faster, though. I'm not an SSE guru, so I don't know how to do this. Maybe someone else can chime in.

星月不相逢

2020-12-04 18:42

What compiler are you using? In Microsoft's more recent C/C++ compilers, there is an option under C/C++ -> Code Generation -> Floating Point Model with three settings: fast, precise, and strict. I think precise is the default, and it works by emulating FP operations to some extent. If you are using an MS compiler, how is this option set? Does it help to set it to "fast"? In any case, what does the disassembly look like?

As thirtyseven said above, the CPU can convert float<->int in essentially one instruction, and it doesn't get any faster than that (short of a SIMD operation).

Also note that modern CPUs use the same FP unit for both single (32 bit) and double (64 bit) FP numbers, so unless you are trying to save memory storing a lot of floats, there's really no reason to favor float over double.

被撕碎了的回忆

2020-12-04 18:43

In Visual C++ 2008, the compiler generates SSE2 instructions by itself if you do a release build with maxed-out optimization options; you can see them in the disassembly. Some conditions have to be met, though, so play around with your code.

清酒与你

2020-12-04 18:48

Is the time large enough that it outweighs the cost of starting a couple of threads?

Assuming you have a multi-core processor or multiple processors on your box that you could take advantage of, this would be a trivial task to parallelize across multiple threads.
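As a rough illustration of that idea, here is a minimal sketch that splits the array into contiguous chunks and converts each chunk on its own `std::thread`. The function name `convert_parallel` and the `num_threads` parameter are made up for this example and are not from the question; whether this wins anything depends on the array being large enough to pay for thread startup.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: converts in[0..n) to out[0..n) by truncation,
// giving each worker thread one contiguous chunk of the range.
void convert_parallel(const float* in, int* out, std::size_t n,
                      unsigned num_threads) {
    assert(num_threads > 0);
    // Chunk size rounded up so that all n elements are covered.
    const std::size_t chunk = (n + num_threads - 1) / num_threads;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // nothing left for this or later threads
        workers.emplace_back([=] {
            for (std::size_t i = begin; i < end; ++i)
                out[i] = static_cast<int>(in[i]);  // truncates toward zero
        });
    }
    for (auto& w : workers) w.join();  // wait for every chunk to finish
}
```

A reasonable starting value for `num_threads` is `std::thread::hardware_concurrency()`, falling back to 1 if it returns 0. Measure before and after; for small arrays the single-threaded loop will likely be faster.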

温柔的废话

2020-12-04 18:50

On Intel your best bet is inline SSE2 calls.

温柔的废话

2020-12-04 18:53

Most C compilers generate calls to _ftol or something similar for every float-to-int conversion. Setting a reduced floating-point conformance switch (like fp:fast) might help, IF you understand AND accept the other effects of this switch.

Other than that, put the thing in a tight assembly or SSE intrinsic loop, IF you are OK with AND understand the different rounding behavior.

For large loops like your example, you should write a function that sets up the floating-point control word once, does the bulk rounding with only fistp instructions, and then resets the control word, IF you are OK with an x86-only code path. At least that way you will not change the rounding. Read up on the fld and fistp FPU instructions and the FPU control word.
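To make the rounding difference concrete, here is a small portable sketch. It is not the fistp loop described above: it uses `std::lrint` as a stand-in, because `std::lrint` rounds according to the current rounding mode (round-to-nearest-even by default), whereas a C cast always truncates toward zero. The function names are made up for this example.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper: plain cast, always truncates toward zero.
void convert_truncate(const float* in, int* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<int>(in[i]);
}

// Hypothetical helper: rounds using the current floating-point rounding
// mode (round-to-nearest, ties-to-even unless the program changed it).
void convert_current_mode(const float* in, int* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<int>(std::lrint(in[i]));
}
```

With the default rounding mode, 2.7f becomes 2 with the cast but 3 with `std::lrint`, and 2.5f becomes 2 in both cases because ties go to the even integer. Code that depends on truncation needs to account for this before switching to a mode-dependent conversion.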
