Hi all :)
I\'m trying to get a hang on a few concepts regarding floating point, SIMD/math intrinsics and the fast-math flag for gcc. More specifically, I\'m using MinGW with
Ok, I'm ansewring for anyone who is struggling a bit to grasp these concepts like me.
Optimizations with Ox work on any kind of code, fpu or sse
fast-math seems to work only on x87 code. Also, it doesn't seem to change the fpu control word o_O
Builtins are always included. This behavior can be avoided for some builtins, with some flags, like strict or no-builtins.
The libm.a is used for some stuff that is not included in the glibc, but with mingw it's just a dummy file, so at the moment it's useless to link to it
Using the special vector types of gcc seems useful only when calling the intrinsics directly, otherwise the code gets vectorized anyway.
Any correction is welcomed :)
Useful links:
fpu / sse control
gcc math
and the gcc manual on "Vector Extensions", "X86 Built-in functions" and "Other Builtins"