I'd like to temporarily enable FTZ/DAZ modes to get a performance gain for some code where strict compliance with the IEEE 754 standard is not an
Yes, MXCSR is part of the per-thread architectural state saved/restored by context switches, along with the xmm/ymm/zmm and x87 stack registers (using xsave/xrstor). Each thread has its own FPU state, so setting FTZ/DAZ in one thread doesn't affect the others.
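As a rough sketch of what "temporarily" could look like on x86-64 with GCC or clang: save MXCSR, set the FTZ and DAZ bits, run the hot code, then restore the saved value. The helper names `scale` and `with_ftz_daz` are made up for illustration; the intrinsics and macros come from `<xmmintrin.h>` and `<pmmintrin.h>`.

```c
#include <float.h>
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr, _MM_SET_FLUSH_ZERO_MODE */
#include <pmmintrin.h>  /* _MM_SET_DENORMALS_ZERO_MODE */

/* Hypothetical stand-in for the performance-critical code.
 * The volatiles keep the compiler from constant-folding the multiply
 * or moving it across the MXCSR writes below. */
static float scale(float x, float y)
{
    volatile float vx = x, vy = y;
    volatile float r = vx * vy;
    return r;
}

/* Hypothetical helper: call fn with FTZ and DAZ set in this thread only,
 * then put MXCSR back exactly as it was. */
static float with_ftz_daz(float (*fn)(float, float), float a, float b)
{
    unsigned int saved = _mm_getcsr();                  /* save current MXCSR */
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);         /* FTZ: subnormal results become 0 */
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON); /* DAZ: subnormal inputs read as 0 */
    float r = fn(a, b);
    _mm_setcsr(saved);                                  /* restore the previous modes */
    return r;
}
```

With the default MXCSR, `scale(FLT_MIN, 0.5f)` gives a nonzero subnormal; `with_ftz_daz(scale, FLT_MIN, 0.5f)` should give exactly `0.0f`, and a later plain call gives the subnormal again because the saved state was restored.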
Interesting idea, I'd always figured DAZ was only useful if you had denormal constants or something (or data from a file), but having other threads running without FTZ is another source of denormals.
You might also want to compile some files with -ffast-math, or a subset of those options. Note that linking with -ffast-math in gcc will include a CRT function that sets DAZ/FTZ before main(), so don't do that.
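If you want to confirm that nothing at startup already turned these modes on, you can read the bits back from MXCSR. This is just a sketch for x86-64; `ftz_enabled` and `daz_enabled` are hypothetical helper names, while the `_MM_GET_*` macros are the standard ones from the intrinsics headers.

```c
#include <xmmintrin.h>  /* _MM_GET_FLUSH_ZERO_MODE, _MM_FLUSH_ZERO_ON */
#include <pmmintrin.h>  /* _MM_GET_DENORMALS_ZERO_MODE, _MM_DENORMALS_ZERO_ON */

/* Hypothetical helper: nonzero if FTZ is set in the calling thread's MXCSR. */
static int ftz_enabled(void)
{
    return _MM_GET_FLUSH_ZERO_MODE() == _MM_FLUSH_ZERO_ON;
}

/* Hypothetical helper: nonzero if DAZ is set in the calling thread's MXCSR. */
static int daz_enabled(void)
{
    return _MM_GET_DENORMALS_ZERO_MODE() == _MM_DENORMALS_ZERO_ON;
}
```

Calling both at the top of main() tells you whether the startup code changed the defaults; both should return 0 unless something like a -ffast-math link step set the bits.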
The optimizations enabled by fast-math are mostly orthogonal to whether denormals are flushed to zero. Even just -fno-math-errno lets more math functions inline (better, or at all), e.g. sqrtf, and is totally safe if you don't care about errno being set as well as getting a NaN result.