I have an image drawing routine which is compiled multiple times for SSE, SSE2, SSE3, SSE4.1, SSE4.2, AVX and AVX2. My program dynamically dispatches one of these binary var
For detecting instruction set features there are two source files I reference:
Both of these files will tell you how to detect SSE through AVX2 as well as XOP, FMA3, FMA4, if your OS supports AVX and other features.
I am used to Agner's code (one source file for MSVC, GCC, Clang, ICC) so let's look at that first.
Here are the relevant code fragments from instrset_detect.cpp
for detecting AVX:
iset = 0; // default value
int abcd[4] = {0,0,0,0}; // cpuid results
cpuid(abcd, 0); // call cpuid function 0
//....
iset = 6; // 6: SSE4.2 supported
if ((abcd[2] & (1 << 27)) == 0) return iset; // no OSXSAVE
if ((xgetbv(0) & 6) != 6) return iset; // AVX not enabled in O.S.
if ((abcd[2] & (1 << 28)) == 0) return iset; // no AVX
iset = 7; // 7: AVX supported
with xgetbv
defined as
// Define interface to xgetbv instruction
static inline int64_t xgetbv (int ctr) {
#if (defined (_MSC_FULL_VER) && _MSC_FULL_VER >= 160040000) || (defined (__INTEL_COMPILER) && __INTEL_COMPILER >= 1200) // Microsoft or Intel compiler supporting _xgetbv intrinsic
return _xgetbv(ctr); // intrinsic function for XGETBV
#elif defined(__GNUC__) // use inline assembly, Gnu/AT&T syntax
uint32_t a, d;
__asm("xgetbv" : "=a"(a),"=d"(d) : "c"(ctr) : );
return a | (uint64_t(d) << 32);
#else // #elif defined (_WIN32) // other compiler. try inline assembly with masm/intel/MS syntax
//see the source file
}
I did not include the cpuid
function (see the source code) and I removed the non GCC inline assembly from xgetbv
to make the answer shorter.
Here is the detect_OS_AVX()
from Mysticial's cpu_x86.cpp
for detecting AVX:
bool cpu_x86::detect_OS_AVX(){
// Copied from: http://stackoverflow.com/a/22521619/922184
bool avxSupported = false;
int cpuInfo[4];
cpuid(cpuInfo, 1);
bool osUsesXSAVE_XRSTORE = (cpuInfo[2] & (1 << 27)) != 0;
bool cpuAVXSuport = (cpuInfo[2] & (1 << 28)) != 0;
if (osUsesXSAVE_XRSTORE && cpuAVXSuport)
{
uint64_t xcrFeatureMask = xgetbv(_XCR_XFEATURE_ENABLED_MASK);
avxSupported = (xcrFeatureMask & 0x6) == 0x6;
}
return avxSupported;
}
Mystical apparently came up with this solution from this answer.
Notice that both source files do basically the same thing: check the OSXSAVE bit 27, check the AVX bit 28 from CPUID, check a result from xgetbv
.
For AVX the answer is quite straightforward:
You need at least OS X 10.6.7
Please note that only build 10J3250 and 10J4138 would support it.
For AVX2 that would be 10.8.4 build 12E3067 or 12E4022