What is the fastest way you know to convert a floating-point number to an int on an x86 CPU? Preferably in C or assembly (that can be in-lined in C), for any combination of floating-point and integer widths (e.g. 32-bit float to 32-bit int, 64-bit double to 64-bit int).
A commonly used trick for plain x86/x87 code is to force the mantissa part of the float to represent the int. The 32-bit version follows.
The 64-bit version is analogous (a sketch follows the 32-bit code below). The Lua version posted below is faster, but it relies on truncating a double to a 32-bit result; it therefore requires the x87 unit to be set to double precision and cannot be adapted for double to 64-bit int conversion.
The nice thing about this code is that it is completely portable across all platforms conforming to IEEE 754; the only assumption made is that the floating-point rounding mode is set to round-to-nearest. Note: portable in the sense that it compiles and works. Platforms other than x86 usually do not benefit much from this technique, if at all.
#include <assert.h>
#include <math.h>

static const float Snapper = 3 << 22;   /* 12582912.0f = 1.5 * 2^23 */

union UFloatInt {
    int   i;
    float f;
};

/** by Vlad Kaipetsky
    portable assuming the FPU is set to round-to-nearest mode
    efficient on x86 platforms
*/
static inline int toInt( float fval )
{
    assert( fabs(fval) <= 0x003fffff );   /* handles magnitudes up to 2^22 - 1 */
    union UFloatInt fi;                   /* union type punning, well-defined in C */
    fi.f = fval + Snapper;                /* the add forces the integer into the mantissa bits */
    return ( fi.i & 0x007fffff ) - 0x00400000;
}
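A hedged sketch of the analogous double version mentioned above (the names SnapperD, UDoubleInt, and toInt64 are invented here; same assumptions: IEEE 754 and round-to-nearest). Note the magic constant is the same 6755399441055744.0 that appears in the Lua snippet below, i.e. 1.5 * 2^52:

#include <assert.h>
#include <math.h>
#include <stdint.h>

static const double SnapperD = 6755399441055744.0;   /* 1.5 * 2^52 */

union UDoubleInt {
    int64_t i;
    double  d;
};

/* Sketch only: handles magnitudes up to 2^51 - 1. */
static inline int64_t toInt64( double dval )
{
    assert( fabs(dval) <= 0x0007ffffffffffffLL );
    union UDoubleInt di;
    di.d = dval + SnapperD;   /* force the integer into the mantissa bits */
    return ( di.i & 0x000fffffffffffffLL ) - 0x0008000000000000LL;
}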
Generally, you can trust the compiler to be efficient and correct. There is usually nothing to be gained by rolling your own functions for something that already exists in the compiler.
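For example, C99 already provides lrint()/lrintf() in <math.h>, which convert using the current rounding mode; on x86, compilers typically lower them to a single conversion instruction. A minimal sketch (link with -lm on Unix-like systems):

#include <math.h>
#include <stdio.h>

int main(void)
{
    /* lrint converts using the current rounding mode (round-to-nearest
       by default), so no FPU control-word juggling is needed. */
    printf("%ld\n", lrint(42.25));   /* 42 */
    printf("%ld\n", lrint(42.75));   /* 43 */
    return 0;
}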
The Lua code base has the following snippet to do this (check in src/luaconf.h from www.lua.org). If you find (SO finds) a faster way, I'm sure they'd be thrilled.
Oh, lua_Number means double. :)
/*
@@ lua_number2int is a macro to convert lua_Number to int.
@@ lua_number2integer is a macro to convert lua_Number to lua_Integer.
** CHANGE them if you know a faster way to convert a lua_Number to
** int (with any rounding method and without throwing errors) in your
** system. In Pentium machines, a naive typecast from double to int
** in C is extremely slow, so any alternative is worth trying.
*/
/* On a Pentium, resort to a trick */
#if defined(LUA_NUMBER_DOUBLE) && !defined(LUA_ANSI) && !defined(__SSE2__) && \
(defined(__i386) || defined (_M_IX86) || defined(__i386__))
/* On a Microsoft compiler, use assembler */
#if defined(_MSC_VER)
#define lua_number2int(i,d) __asm fld d __asm fistp i
#define lua_number2integer(i,n) lua_number2int(i, n)
/* the next trick should work on any Pentium, but sometimes clashes
with a DirectX idiosyncrasy */
#else
union luai_Cast { double l_d; long l_l; };
#define lua_number2int(i,d) \
{ volatile union luai_Cast u; u.l_d = (d) + 6755399441055744.0; (i) = u.l_l; }
#define lua_number2integer(i,n) lua_number2int(i, n)
#endif
/* this option always works, but may be slow */
#else
#define lua_number2int(i,d) ((i)=(int)(d))
#define lua_number2integer(i,d) ((i)=(lua_Integer)(d))
#endif
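To see the non-MSVC branch in action outside Lua, here is a minimal standalone sketch (assumptions: IEEE 754 doubles, little-endian x86, round-to-nearest; l_l is declared int32_t here rather than Lua's long, so the low 32 bits are read on both 32- and 64-bit builds):

#include <stdint.h>
#include <stdio.h>

/* int32_t instead of Lua's long: the #if guards above restrict the
   real macro to 32-bit x86, where long is 32 bits anyway. */
union luai_Cast { double l_d; int32_t l_l; };

#define lua_number2int(i,d) \
  { volatile union luai_Cast u; u.l_d = (d) + 6755399441055744.0; (i) = u.l_l; }

int main(void)
{
    int i;
    lua_number2int(i, 42.25);   /* round-to-nearest gives 42 */
    printf("%d\n", i);
    lua_number2int(i, -7.75);   /* round-to-nearest gives -8 */
    printf("%d\n", i);
    return 0;
}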
If you can guarantee the CPU running your code is SSE3 compatible (every Intel x86 CPU since the Prescott Pentium 4 is), you can allow the compiler to use its FISTTP instruction (e.g. -msse3 for gcc). It does the conversion the way it should always have been done:
http://software.intel.com/en-us/articles/how-to-implement-the-fisttp-streaming-simd-extensions-3-instruction/
Note that FISTTP is different from FISTP (whose behavior causes the slowness: FISTP obeys the current rounding mode, so to get C's truncation semantics the compiler must save, modify, and restore the FPU control word around every cast). FISTTP comes as part of SSE3 but is actually the only x87-side refinement: it always truncates, regardless of the rounding mode.
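As a hedged illustration (the file and function names are invented): with -msse3, a plain C cast is enough to let the compiler pick FISTTP; no hand-written assembly is required.

/* fisttp_demo.c (hypothetical): compile with gcc -O2 -m32 -msse3.
   With -msse3 the compiler may emit FISTTP for this cast instead of
   the FISTP-plus-control-word dance; on x86-64 it uses CVTTSD2SI
   regardless, since doubles live in SSE registers there. */
#include <stdio.h>

static inline int double_to_int(double d)
{
    return (int)d;   /* C truncation semantics */
}

int main(void)
{
    printf("%d\n", double_to_int(3.7));    /* 3, truncated toward zero */
    printf("%d\n", double_to_int(-3.7));   /* -3 */
    return 0;
}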
CPUs other than x86 would probably do the conversion just fine anyway. :) (See the list of processors with SSE3 support.)