rint not present in Visual Studio 2010 math.h and equivalent of CUDA rint

后端未结

关注

 1  2084

I\'m porting a CUDA code to C++ and using Visual Studio 2010. The CUDA code uses the rint function, which does not seem to be present in the Visual Studio 2010

相关标签:

1条回答

梦如初夏

2021-01-14 05:24
The cited description of rint() in the CUDA documentation is incorrect. Roundings to integer with floating-point result map the IEEE-754 (2008) specified rounding modes as follows:
```
trunc()   // round towards zero
floor()   // round down (towards negative infinity)
ceil()    // round up (towards positive infinity)
rint()    // round to nearest or even (i.e. ties are rounded to even)
round()   // round to nearest, ties away from zero
```
Generally, these functions work as described in the C99 standard. For rint(), the standard specifies that the function rounds according to the current rounding mode (which defaults to round to nearest or even). Since CUDA does not support dynamic rounding modes, all functions that are defined to use the current rounding mode use the rounding mode "round to nearest or even". Here are some examples showing the difference between round() and rint():
```
argument  rint()  round()
1.5       2.0     2.0
2.5       2.0     3.0
3.5       4.0     4.0
4.5       4.0     5.0
```
round() can be emulated fairly easily along the lines of the code you posted, I am not aware of a simple emulation for rint(). Please note that you would not want to use an intermediate cast to integer, as 'int' supports a narrower numeric range than integers that are exactly representable by a 'double'. Instead use trunc(), ceil(), floor() as appropriate.

Since rint() is part of both the current C and C++ standards, I am a bit surprised that MSVC does not include this function; I would suggest checking MSDN to see whether a substitute is offered. If your platforms are SSE4 capable, you could use the SSE intrinsics _mm_round_sd(), _mm_round_pd() defined in smmintrin.h, with the rounding mode set to _MM_FROUND_TO_NEAREST_INT, to implement the functionality of CUDA's rint().

While (in my experience), the SSE intrinsics are portable across Windows, Linux, and Mac OS X, you may want to avoid hardware specific code. In this case, you could try the following code (lightly tested):
```
double my_rint(double a)
{
    const double two_to_52 = 4.5035996273704960e+15;
    double fa = fabs(a);
    double r = two_to_52 + fa;
    if (fa >= two_to_52) {
        r = a;
    } else {
        r = r - two_to_52;
        r = _copysign(r, a);
    }
    return r;
}
```
Note that MSVC 2010 seems to lack the standard copysign() function as well, so I had to substitute _copysign(). The above code assumes that the current rounding mode is round-to-nearest-even (which it is by default). By adding 2**52 it makes sure that rounding occurs at the integer unit bit. Note that this also assumes that pure double-precision computation is performed. On platforms that use some higher precision for intermediate results one might need to declare 'fa' and 'r' as volatile.
0 讨论(0)
发布评论:

提交评论
- 加载中...