[There are a few questions on this but none of the answers are particularly definitive and several are out of date with the current C++ standard].
My research shows: use modf(), which breaks the value into integral and fractional parts. From this direct test it is known whether the double is a whole number or not. After this, limit tests against the min/max of the target integer type can be done.
#include <cmath>
bool IsInteger(double x) {
double ipart;
return std::modf(x, &ipart) == 0.0; // Test if fraction is 0.0.
}
Note that modf() differs from the similarly named fmod().
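As a rough sketch of combining the modf() test with the limit tests mentioned above (the IsIntegerInRange name and the long long target are just for illustration; the exclusive upper bound of 2^63 is used because LLONG_MAX is not exactly representable as a double):

#include <cmath>
#include <limits>

// Sketch: modf() whole-number test plus range checks against a target
// integer type (long long chosen purely as an example).
bool IsIntegerInRange(double x)
{
    double ipart;
    if (std::modf(x, &ipart) != 0.0) return false;    // has a fractional part
    const double lo = static_cast<double>(std::numeric_limits<long long>::min()); // -2^63, exact
    return x >= lo && x < -lo;                         // x < +2^63 (exclusive)
}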
Of the 3 methods OP posted, the cast to/from an integer may perform a fair amount of work doing the casts and the compare. The other 2 are marginally the same: they work, assuming no unexpected rounding-mode effects from the division by 1.0, but they perform an unnecessary division.
As to which is fastest likely depends on the mix of doubles used.
OP's first method has a singular advantage: since the goal is to test whether a floating-point value converts exactly to some integer, and when the result is true the conversion likely needs to happen anyway, OP's first method has already done the conversion.
The problem with:
if ( f >= std::numeric_limits<T>::min()
&& f <= std::numeric_limits<T>::max()
&& f == (T)f )
is that if T is (for example) 64 bits, then the max will be rounded when converting to your usual 64 bit double :-( Assuming 2's complement, the same is not true of the min, of course.
So, depending on the number of bits in the mantissa and the number of bits in T, you need to mask off the LS bits of std::numeric_limits<T>::max()... I'm sorry, I don't do C++, so how best to do that I leave to others. [In C it would be something along the lines of LLONG_MAX ^ (LLONG_MAX >> DBL_MANT_DIG) -- assuming T is long long int and f is double, and that these are both the usual 64-bit values.]
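For the record, a C++ rendering of that C expression might look like this (a sketch, assuming T is long long int and the usual 64-bit IEEE double):

#include <cfloat>
#include <climits>

// Sketch: the largest long long whose low bits are cleared so that it
// converts to double exactly (assumes 64-bit long long, IEEE-754 double).
constexpr long long masked_max = LLONG_MAX ^ (LLONG_MAX >> DBL_MANT_DIG);
// The upper-bound test then becomes: f <= static_cast<double>(masked_max)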
If the T is constant, then the construction of the two floating point values for min and max will (I assume) be done at compile time, so the two comparisons are pretty straightforward. You don't really need to be able to float T... but you do need to know that its min and max will fit in an ordinary integer (long long int, say).
The remaining work is converting the float to integer, and then floating that back up again for the final comparison. So, assuming f is in range (which guarantees (T)f does not overflow):
i = (T)f ; // or i = (long long int)f ;
ok = (i == f) ;
The alternative seems to be:
i = (T)f ; // or i = (long long int)f ;
ok = (floor(f) == f) ;
as noted elsewhere, which replaces the floating of i with floor(f)... which I'm not convinced is an improvement.
If f is NaN things may go wrong, so you might want to test for that too.
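Putting the range check, the convert-and-compare, and the NaN test together, a sketch of the whole thing (with T fixed to long long int purely for illustration, and the masked maximum from above so the upper bound converts to double exactly):

#include <cfloat>
#include <climits>
#include <cmath>

// Sketch of the complete test: NaN check, range check using the masked
// maximum, then convert-and-compare. T is fixed to long long here only
// for illustration.
bool ConvertsExactly(double f)
{
    if (std::isnan(f)) return false;
    constexpr long long masked_max = LLONG_MAX ^ (LLONG_MAX >> DBL_MANT_DIG);
    if (f < static_cast<double>(LLONG_MIN) || f > static_cast<double>(masked_max))
        return false;                            // out of range (or +/- infinity)
    long long i = static_cast<long long>(f);     // safe: f is in range
    return static_cast<double>(i) == f;          // exact round-trip => integral
}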
You could try unpacking f with frexp() and extracting the mantissa as (say) a long long int (with ldexp() and a cast), but when I started to sketch that out it looked ugly :-(
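For what it's worth, such a frexp()-based version might be sketched as follows (assuming the usual 64-bit IEEE double; it is, indeed, not pretty):

#include <cfloat>
#include <cmath>
#include <cstdint>

// Sketch of the frexp()/ldexp() idea (assumes IEEE-754 double): scale the
// mantissa up to a whole number and check that the bits below the original
// binary point are all zero.
bool IsWholeViaFrexp(double f)
{
    if (std::isnan(f) || std::isinf(f)) return false;
    if (f == 0.0) return true;
    int exp;
    double m = std::frexp(f, &exp);               // f == m * 2^exp, 0.5 <= |m| < 1
    if (exp <= 0) return false;                   // |f| < 1 and f != 0
    if (exp >= DBL_MANT_DIG) return true;         // value spacing >= 1, always integral
    auto bits = static_cast<std::int64_t>(std::ldexp(m, DBL_MANT_DIG));
    std::int64_t mask = (std::int64_t{1} << (DBL_MANT_DIG - exp)) - 1;
    return (bits & mask) == 0;
}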
Having slept on it, a simpler way of dealing with the max issue is to do: min <= f < ((unsigned)max+1)
-- or min <= f < (unsigned)min
-- or (double)min <= f < -(double)min
-- or any other method of constructing -2^(n-1) and +2^(n-1) as floating point values, where n is the number of bits in T.
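That last form is easy to express generically; a minimal sketch, assuming T is a two's-complement signed type (so its minimum is a power of two and converts to double exactly):

#include <limits>

// Sketch: -2^(n-1) <= f < +2^(n-1), using -(double)min as the exclusive
// upper bound (n being the number of bits in T).
template <typename T>
bool InRangeOf(double f)
{
    const double lo = static_cast<double>(std::numeric_limits<T>::min());
    return lo <= f && f < -lo;
}

Used as, for example, InRangeOf<long long>(f) before performing the conversion.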
(Serves me right for getting interested in a problem at 1:00am !)
This test is good:
if ( f >= std::numeric_limits<T>::min()
&& f <= std::numeric_limits<T>::max()
&& f == (T)f )
These tests are incomplete:
using std::fmod to extract the remainder and test equality to 0.
using std::remainder and test equality to 0.
They both fail to check that the conversion to T
is defined. Float-to-integral conversions that overflow the integral type result in undefined behaviour, which is even worse than roundoff.
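The remainder-based test becomes complete once the same range guard is added; a sketch, with int as an example target type:

#include <cmath>
#include <limits>

// Sketch: the std::remainder variant with the missing range check added
// (int chosen as an example target type).
bool isinteger_remainder(double d)
{
    return std::numeric_limits<int>::min() <= d
        && d <= std::numeric_limits<int>::max()
        && std::remainder(d, 1.0) == 0.0;
}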
I would recommend avoiding std::fmod for another reason. This code:
#include <cmath>
#include <limits>

int isinteger(double d) {
return std::numeric_limits<int>::min() <= d
&& d <= std::numeric_limits<int>::max()
&& std::fmod(d, 1.0) == 0;
}
compiles (gcc version 4.9.1 20140903 (prerelease) (GCC) on x86_64 Arch Linux using -g -O3 -std=gnu++0x) to this:
0000000000400800 <_Z9isintegerd>:
400800: 66 0f 2e 05 10 01 00 ucomisd 0x110(%rip),%xmm0 # 400918 <_IO_stdin_used+0x18>
400807: 00
400808: 72 56 jb 400860 <_Z9isintegerd+0x60>
40080a: f2 0f 10 0d 0e 01 00 movsd 0x10e(%rip),%xmm1 # 400920 <_IO_stdin_used+0x20>
400811: 00
400812: 66 0f 2e c8 ucomisd %xmm0,%xmm1
400816: 72 48 jb 400860 <_Z9isintegerd+0x60>
400818: 48 83 ec 18 sub $0x18,%rsp
40081c: d9 e8 fld1
40081e: f2 0f 11 04 24 movsd %xmm0,(%rsp)
400823: dd 04 24 fldl (%rsp)
400826: d9 f8 fprem
400828: df e0 fnstsw %ax
40082a: f6 c4 04 test $0x4,%ah
40082d: 75 f7 jne 400826 <_Z9isintegerd+0x26>
40082f: dd d9 fstp %st(1)
400831: dd 5c 24 08 fstpl 0x8(%rsp)
400835: f2 0f 10 4c 24 08 movsd 0x8(%rsp),%xmm1
40083b: 66 0f 2e c9 ucomisd %xmm1,%xmm1
40083f: 7a 22 jp 400863 <_Z9isintegerd+0x63>
400841: 66 0f ef c0 pxor %xmm0,%xmm0
400845: 31 c0 xor %eax,%eax
400847: ba 00 00 00 00 mov $0x0,%edx
40084c: 66 0f 2e c8 ucomisd %xmm0,%xmm1
400850: 0f 9b c0 setnp %al
400853: 0f 45 c2 cmovne %edx,%eax
400856: 48 83 c4 18 add $0x18,%rsp
40085a: c3 retq
40085b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
400860: 31 c0 xor %eax,%eax
400862: c3 retq
400863: f2 0f 10 0d bd 00 00 movsd 0xbd(%rip),%xmm1 # 400928 <_IO_stdin_used+0x28>
40086a: 00
40086b: e8 20 fd ff ff callq 400590 <fmod@plt>
400870: 66 0f 28 c8 movapd %xmm0,%xmm1
400874: eb cb jmp 400841 <_Z9isintegerd+0x41>
400876: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
40087d: 00 00 00
The first five instructions implement the range check against std::numeric_limits<int>::min() and std::numeric_limits<int>::max(). The rest is the fmod test, accounting for all the misbehaviour of a single invocation of the fprem instruction (400828..40082d) and some case where a NaN somehow arose.
You get similar code by using remainder.
The answer: use std::trunc(f) == f.
The time difference is insignificant when comparing all these methods. Even if the specific IEEE unpacking code we write in the example below is technically twice as fast, we are only talking about 1 nanosecond faster per call.
The maintenance costs in the long run, though, would be significantly higher, so a solution that is easier for the maintainer to read and understand is better.
Time in milliseconds to complete 12,000,000 operations on a random set of numbers:

std::trunc(f) == f            32
std::floor(val) - val == 0    35
((uint64_t)f - f) == 0.0      38
std::fmod(val, 1.0) == 0      87

A floating point number has two parts:
mantissa: The data part of the value.
exponent: a power to multiply it by.
such that:
value = mantissa * (2^exponent)
So the exponent is basically how many binary digits we are going to shift the "binary point" along the mantissa: a positive value shifts it right, a negative value shifts it left. If all the digits to the right of the binary point are zero then we have an integer.
If we assume IEEE 754, we should note that in this representation the value is normalized so that the most significant bit of the mantissa is 1. Since this bit is always set, it is not actually stored (the processor knows it's there and compensates accordingly).
So:
If the exponent < 0 then you definitely do not have an integer, as it can only be representing a fractional value. If the exponent >= <Number of bits in Mantissa> then there is definitely no fractional part and it is an integer (though you may not be able to hold it in an int).
Otherwise we have to do some work: if the exponent >= 0 && exponent < <Number of bits in Mantissa> then you may be representing an integer, provided the mantissa is all zero in the bottom part (defined below).
Additionally, as part of the normalization, 127 is added to the exponent (so that there are no negative values stored in the 8-bit exponent field).
#include <limits>
#include <iostream>
#include <cmath>
/*
* Bit 31 Sign
* Bits 30-23 Exponent
* Bits 22-00 Mantissa
*/
bool is_IEEE754_32BitFloat_AnInt(float val)
{
// Put the value in an int so we can do bitwise operations.
int valAsInt = *reinterpret_cast<int*>(&val);
// Remember to subtract 127 from the exponent (to get real value)
int exponent = ((valAsInt >> 23) & 0xFF) - 127;
int bitsInFraction = 23 - exponent;
int mask = exponent < 0
? 0x7FFFFFFF
: exponent > 23
? 0x00
: (1 << bitsInFraction) - 1;
return !(valAsInt & mask);
}
/*
* Bit 63 Sign
* Bits 62-52 Exponent
* Bits 51-00 Mantissa
*/
bool is_IEEE754_64BitFloat_AnInt(double val)
{
// Put the value in a uint64_t so we can do bitwise operations.
uint64_t valAsInt = *reinterpret_cast<uint64_t*>(&val);
// Remember to subtract 1023 from the exponent (to get real value)
int exponent = ((valAsInt >> 52) & 0x7FF) - 1023;
int bitsInFraction = 52 - exponent;
uint64_t mask = exponent < 0
? 0x7FFFFFFFFFFFFFFFLL
: exponent > 52
? 0x00
: (1LL << bitsInFraction) - 1;
return !(valAsInt & mask);
}
bool is_Trunc_32BitFloat_AnInt(float val)
{
return (std::trunc(val) - val == 0.0F);
}
bool is_Trunc_64BitFloat_AnInt(double val)
{
return (std::trunc(val) - val == 0.0);
}
bool is_IntCast_64BitFloat_AnInt(double val)
{
return (uint64_t(val) - val == 0.0);
}
template<typename T, bool isIEEE = std::numeric_limits<T>::is_iec559>
bool isInt(T f);
template<>
bool isInt<float, true>(float f) {return is_IEEE754_32BitFloat_AnInt(f);}
template<>
bool isInt<double, true>(double f) {return is_IEEE754_64BitFloat_AnInt(f);}
template<>
bool isInt<float, false>(float f) {return is_Trunc_32BitFloat_AnInt(f);}
template<>
bool isInt<double, false>(double f) {return is_Trunc_64BitFloat_AnInt(f);}
int main()
{
double x = 16;
std::cout << x << "=> " << isInt(x) << "\n";
x = 16.4;
std::cout << x << "=> " << isInt(x) << "\n";
x = 123.0;
std::cout << x << "=> " << isInt(x) << "\n";
x = 0.0;
std::cout << x << "=> " << isInt(x) << "\n";
x = 2.0;
std::cout << x << "=> " << isInt(x) << "\n";
x = 4.0;
std::cout << x << "=> " << isInt(x) << "\n";
x = 5.0;
std::cout << x << "=> " << isInt(x) << "\n";
x = 1.0;
std::cout << x << "=> " << isInt(x) << "\n";
}
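(One caveat about the bit-twiddling functions above: dereferencing the result of reinterpret_cast on a pointer of a different type formally violates strict aliasing. A well-defined way to get the same bits, sketched here and not part of the benchmark, is to copy them:)

#include <cstdint>
#include <cstring>

// Sketch: strict-aliasing-safe replacement for the pointer reinterpret_cast,
// copying the double's bytes into a uint64_t instead.
uint64_t bits_of(double val)
{
    uint64_t bits;
    std::memcpy(&bits, &val, sizeof bits);
    return bits;
}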
Results:
> ./a.out
16=> 1
16.4=> 0
123=> 1
0=> 1
2=> 1
4=> 1
5=> 1
1=> 1
Test data was generated like this:
(for a in {1..3000000};do echo $RANDOM.$RANDOM;done ) > test.data
(for a in {1..3000000};do echo $RANDOM;done ) >> test.data
(for a in {1..3000000};do echo $RANDOM$RANDOM0000;done ) >> test.data
(for a in {1..3000000};do echo 0.$RANDOM;done ) >> test.data
Modified main() to run tests:
int main()
{
// ORIGINAL CODE still here. Also needs <fstream>, <vector>, <iterator> and <chrono>.
// Added this trivial speed test.
std::ifstream testData("test.data"); // 12 million random numbers generated above
std::vector<double> test{std::istream_iterator<double>(testData), std::istream_iterator<double>()};
std::cout << "Data Size: " << test.size() << "\n";
int count1 = 0;
int count2 = 0;
int count3 = 0;
auto start = std::chrono::system_clock::now();
for(auto const& v: test)
{ count1 += is_IEEE754_64BitFloat_AnInt(v);
}
auto p1 = std::chrono::system_clock::now();
for(auto const& v: test)
{ count2 += is_Trunc_64BitFloat_AnInt(v);
}
auto p2 = std::chrono::system_clock::now();
for(auto const& v: test)
{ count3 += is_IntCast_64BitFloat_AnInt(v);
}
auto end = std::chrono::system_clock::now();
std::cout << "IEEE " << count1 << " Time: " << std::chrono::duration_cast<std::chrono::milliseconds>(p1 - start).count() << "\n";
std::cout << "Trunc " << count2 << " Time: " << std::chrono::duration_cast<std::chrono::milliseconds>(p2 - p1).count() << "\n";
std::cout << "Int Cast " << count3 << " Time: " << std::chrono::duration_cast<std::chrono::milliseconds>(end - p2).count() << "\n"; }
The tests show:
> ./a.out
16=> 1
16.4=> 0
123=> 1
0=> 1
2=> 1
4=> 1
5=> 1
1=> 1
Data Size: 12000000
IEEE 6000199 Time: 18
Trunc 6000199 Time: 32
Int Cast 6000199 Time: 38
The IEEE code (in this simple test) seems to beat the truncate method and generates the same result. BUT the amount of time is insignificant: over 12 million calls we saw a difference of 14 milliseconds.
Use std::fmod(f, 1.0) == 0.0 where f is either a float, double, or long double. If you're worried about spurious effects of unwanted floating point promotions when using floats, then use either 1.0f or the more comprehensive
std::fmod(f, static_cast<decltype(f)>(1.0)) == 0.0
which will force, obviously at compile time, the correct overload to be called. The return value of std::fmod(f, ...) will have magnitude less than 1 (it takes the sign of f), and it's perfectly safe to compare it to 0.0 to complete your integer check.
If it turns out that f is an integer, then make sure it's within the permitted range of your chosen type before attempting a cast: else you risk invoking undefined behaviour. I see that you're already familiar with std::numeric_limits, which can help you here.
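As a sketch of those two steps combined (int used as an example target type):

#include <cmath>
#include <limits>

// Sketch: fmod integrality test plus a range check before any cast to int
// (int is just an example target type).
bool is_exact_int(double f)
{
    return std::fmod(f, 1.0) == 0.0
        && f >= std::numeric_limits<int>::min()
        && f <= std::numeric_limits<int>::max();
}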
My reservations against using std::remainder are possibly (i) my being a Luddite and (ii) it not being available in some compilers partially implementing the C++11 standard, such as MSVC12. I don't like solutions involving casts, since the notation hides that reasonably expensive operation and you need to check in advance for safety. If you must adopt your first choice, at least replace the C-style cast with static_cast<T>(f).
If your question is "Can I convert this double to int without loss of information?" then I would do something simple like :
template <typename T, typename U>
bool CanConvert(U u)
{
return U(T(u)) == u;
}
CanConvert<int>(1.0) -- true
CanConvert<int>(1.5) -- false
CanConvert<int>(1e9) -- true
CanConvert<int>(1e10) -- false

(As the other answers note, the conversion inside CanConvert is only well-defined when the value fits in T; strictly speaking CanConvert<int>(1e10) invokes undefined behaviour before the comparison, even though it typically evaluates to false.)