I have production code that has kernels implemented for various SIMD instruction sets, including AVX, AVX2, and AVX512. The code can be compiled on the target machine for the ta