Detect FPU rounding mode on a GPU

余生长醉 submitted on 2019-12-04 20:14:34

This C code tells you that you are either in round-to-nearest-even or using a strange floating-point architecture indeed:

int is_round_to_nearest(void)
{
  if ( 1.0 + 0x1.0p-100 !=  1.0) return 0;
  if ( 1.0 - 0x1.0p-100 !=  1.0) return 0;
  if (-1.0 + 0x1.0p-100 != -1.0) return 0;
  if (-1.0 - 0x1.0p-100 != -1.0) return 0;
  return 1;
}

You can add an f suffix to all twelve floating-point constants above to obtain a single-precision function.
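As a rough sketch of that single-precision variant, the function below applies the f suffix to every constant. The volatile temporaries are an addition that is not in the original: they are there on the assumption that a compiler might otherwise evaluate the comparisons at compile time, in which case the result would not reflect the run-time rounding mode. The name is_round_to_nearest_f is made up for illustration.

```c
/* Single-precision sketch of is_round_to_nearest().
   The volatile temporaries are an assumption added here to keep the
   additions from being folded at compile time; the name is illustrative. */
int is_round_to_nearest_f(void)
{
  volatile float one = 1.0f, tiny = 0x1.0p-100f;
  volatile float r;

  r =  one + tiny; if (r !=  1.0f) return 0;
  r =  one - tiny; if (r !=  1.0f) return 0;
  r = -one + tiny; if (r != -1.0f) return 0;
  r = -one - tiny; if (r != -1.0f) return 0;
  return 1;
}
```

The constant 0x1.0p-100f is far below half an ULP of 1.0f but still a normal float, so in round-to-nearest none of the four sums should move.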

I ended up developing a slightly modified routine that tests how ties are handled rather than which rounding mode is in effect. Even when the mode is round-to-nearest (as correctly detected by Pascal Cuoq's code above), ties could still be broken away from zero, although on x86 machines they usually are not.

The code to detect round-to-nearest with ties to even is:

int b_TieBreak_ToEven()
{
    //                                      <- 16B double ->
    //                                         <- fraction->
    const double special = f_ParseXDouble("0x0.00000000000008p+0"); // 2^-53: a single 1 bit one position below the LSB of 1.0 (half an ULP)
    const double oddone =  f_ParseXDouble("0x1.0000000000001p+0"); // 1 + 2^-52: 1.0 with the LSB set (odd significand)
    const double evenone = f_ParseXDouble("0x1.0000000000002p+0"); // 1 + 2^-51: 1.0 with the bit above the LSB set (even significand)

    volatile double v;
    v = 1.0; v += special;
    if(v != 1.0)
        return 0;
    v = oddone; v += special;
    if(v != evenone) // odd + half rounds to even
        return 0;
    v = evenone; v += special;
    if(v != evenone) // even + half rounds to the same even
        return 0;
    v = -1.0; v -= special;
    if(v != -1.0)
        return 0;
    v = -oddone; v -= special;
    if(v != -evenone) // -odd - half rounds to -even
        return 0;
    v = -evenone; v -= special;
    if(v != -evenone) // -even - half rounds to the same -even
        return 0;

    return 1;
}

I tested this on Windows, Linux, BSD and Raspbian, and it seems to work well. The source code contains a small routine, f_ParseXDouble, that parses doubles and floats in hexadecimal format. You can download the source code here.
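The source of f_ParseXDouble is not shown here. For trying out the routine above, a stand-in might be sketched as follows, assuming a C99 standard library, where strtod() accepts hexadecimal floating-point strings. The name parse_xdouble_sketch and its error handling are hypothetical and not the author's implementation.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the author's f_ParseXDouble (whose source is
   not shown here). Assumes a C99 library, where strtod() accepts
   hexadecimal floating-point strings such as "0x1.8p+1". */
double parse_xdouble_sketch(const char *text)
{
    char *end;
    double value = strtod(text, &end);

    /* Report a failed or partial parse as 0.0; a real parser would
       signal the error separately. */
    if (end == text || *end != '\0')
        return 0.0;
    return value;
}
```

With this in place, the three constants of the tie-breaking test could be produced by calling parse_xdouble_sketch with the same strings. Each of them is exactly representable as a double, so no rounding should occur during parsing.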

On my AMD-based Windows machine, this says:

all unit tests passed
default : mode : to nearest, ties to even (ties to even: 1)
_RC_CHOP : mode : towards zero (ties to even: 0)
_RC_UP : mode : towards positive infinity (ties to even: 0)
_RC_DOWN : mode : towards negative infinity (ties to even: 0)
_RC_NEAR : mode : to nearest, ties to even (ties to even: 1)
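The four mode names in that output can be told apart with the same kind of arithmetic probe used in the first function. The following is a minimal sketch of one way to do it, not the code that produced the output above; the enum and function names are hypothetical, and the volatile variables are an assumption to keep the sums from being folded at compile time.

```c
/* Hypothetical classifier (not the author's code): distinguishes the four
   IEEE 754 binary rounding directions by observing which of four sums
   moves away from +/-1.0. volatile keeps the sums from being folded. */
enum rounding_guess { RG_NEAREST, RG_TOWARD_ZERO, RG_UPWARD, RG_DOWNWARD, RG_UNKNOWN };

enum rounding_guess guess_rounding_mode(void)
{
    volatile double one = 1.0, tiny = 0x1.0p-100;
    volatile double a = one + tiny;    /* > 1.0 only when rounding up */
    volatile double b = one - tiny;    /* < 1.0 when rounding down or toward zero */
    volatile double c = -one + tiny;   /* > -1.0 when rounding up or toward zero */
    volatile double d = -one - tiny;   /* < -1.0 only when rounding down */

    int pos_grew = a > 1.0, pos_shrank = b < 1.0;
    int neg_grew = c > -1.0, neg_shrank = d < -1.0;

    if (!pos_grew && !pos_shrank && !neg_grew && !neg_shrank) return RG_NEAREST;
    if (!pos_grew &&  pos_shrank &&  neg_grew && !neg_shrank) return RG_TOWARD_ZERO;
    if ( pos_grew && !pos_shrank &&  neg_grew && !neg_shrank) return RG_UPWARD;
    if (!pos_grew &&  pos_shrank && !neg_grew &&  neg_shrank) return RG_DOWNWARD;
    return RG_UNKNOWN;
}
```

In portable C99 the mode could be switched with fesetround() from <fenv.h> before calling the probe. The _RC_* names in the output above are the Windows-specific equivalents.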

On Raspberry Pi:

all unit tests passed
default : mode : to nearest, ties to even (ties to even: 1)

On NVIDIA GPUs (480, 680 and 780, in OpenCL):

OpenCL platform 'NVIDIA CUDA' by NVIDIA Corporation, version OpenCL 1.1 CUDA 6.0.1, FULL_PROFILE
device: NVIDIA Corporation 'GeForce GTX 680' (driver version: 331.65)
        OpenCL version: OpenCL 1.1 CUDA
        OpenCL "C" version: OpenCL C 1.1

GPU mode: round to nearest, ties to even