Setting __m256i to the value of two __m128i values

逝去的感伤 2021-01-05 07:46

So, AVX has a function in immintrin.h that should allow storing the concatenation of two __m128i values into a single __m256i val

2 Answers
  • 2021-01-05 08:24

    We had the same problem and used a macro to work around it.

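    #include <immintrin.h>

    /* Fallback for GCC releases before 8 only. */
    #if defined(__GNUC__) && __GNUC__ < 8
    /* imm8 = 2: low lane = low half of xmm2, high lane = low half of xmm1 */
    #define _mm256_set_m128i(xmm1, xmm2) _mm256_permute2f128_si256(_mm256_castsi128_si256(xmm1), _mm256_castsi128_si256(xmm2), 2)
    /* float variant under our own name; the corresponding standard intrinsic is _mm256_set_m128 */
    #define _mm256_set_m128f(xmm1, xmm2) _mm256_permute2f128_ps(_mm256_castps128_ps256(xmm1), _mm256_castps128_ps256(xmm2), 2)
    #endif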
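
The immediate value 2 in the permute form is easy to get wrong, so a small lane-order check may be worth keeping next to the macro. The following is only a sketch: `AVX_TARGET`, `concat_permute` and `permute_lane_order_ok` are hypothetical names made up for this example, and the per-function target attribute is assumed to be available (GCC and clang); other compilers would need their usual AVX switch instead.

```c
#include <immintrin.h>
#include <stdint.h>
#include <string.h>

/* Assumption: GCC/clang accept a per-function target attribute, so this
   builds without -mavx. Other compilers need their own AVX option. */
#if defined(__GNUC__) || defined(__clang__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

/* Hypothetical helper: concatenates hi:lo using the permute2f128 form
   and stores the eight 32-bit elements to out. */
AVX_TARGET
static void concat_permute(const int32_t hi[4], const int32_t lo[4],
                           int32_t out[8])
{
    __m128i vhi = _mm_loadu_si128((const __m128i *)hi);
    __m128i vlo = _mm_loadu_si128((const __m128i *)lo);
    /* imm8 = 2: bits [1:0] = 2 select the low half of the second source
       for the result's low lane; bits [5:4] = 0 select the low half of
       the first source for the result's high lane. */
    __m256i v = _mm256_permute2f128_si256(_mm256_castsi128_si256(vhi),
                                          _mm256_castsi128_si256(vlo), 2);
    _mm256_storeu_si256((__m256i *)out, v);
}

/* Returns 1 if lo lands in elements 0..3 and hi in elements 4..7. */
static int permute_lane_order_ok(void)
{
    const int32_t hi[4] = {4, 5, 6, 7};
    const int32_t lo[4] = {0, 1, 2, 3};
    const int32_t expected[8] = {0, 1, 2, 3, 4, 5, 6, 7};
    int32_t out[8];

    concat_permute(hi, lo, out);
    return memcmp(out, expected, sizeof expected) == 0;
}
```

Calling `permute_lane_order_ok()` once at startup or from a unit test is enough; it needs a CPU with AVX to run.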
    
  • 2021-01-05 08:34

    Not all compilers seem to have _mm256_setr_m128i, or even _mm256_set_m128i, defined in immintrin.h. So I usually just define macros as needed, bracketed with suitable #ifdefs which test for compiler and version:

    #define _mm256_set_m128i(v0, v1)  _mm256_insertf128_si256(_mm256_castsi128_si256(v1), (v0), 1)
    
    #define _mm256_setr_m128i(v0, v1) _mm256_set_m128i((v1), (v0))
    
    • Intel ICC 11.1 and later have both _mm256_set_m128i and _mm256_setr_m128i.

    • MSVC 2012 and later have just _mm256_set_m128i.

    • gcc/clang don't seem to have either, although I haven't checked recent versions to see if this has been fixed yet.
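
An alternative to version-specific #ifdefs, sketched here under stated assumptions, is to wrap the insert form in inline functions with names of your own, so nothing can clash with whatever a given immintrin.h already provides. `my_set_m128i`, `my_setr_m128i`, `set_and_setr_agree` and `AVX_TARGET` are hypothetical names invented for this example, and the target attribute is assumed to be supported (GCC and clang).

```c
#include <immintrin.h>
#include <stdint.h>
#include <string.h>

/* Assumption: GCC/clang per-function target attribute; other compilers
   need AVX enabled through their own options. */
#if defined(__GNUC__) || defined(__clang__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

/* Hypothetical wrapper: hi goes to the upper 128 bits, lo to the lower. */
AVX_TARGET
static inline __m256i my_set_m128i(__m128i hi, __m128i lo)
{
    return _mm256_insertf128_si256(_mm256_castsi128_si256(lo), hi, 1);
}

/* Hypothetical wrapper: same result, arguments in reversed order. */
AVX_TARGET
static inline __m256i my_setr_m128i(__m128i lo, __m128i hi)
{
    return my_set_m128i(hi, lo);
}

/* Returns 1 if both wrappers put lo in elements 0..3 and hi in 4..7. */
AVX_TARGET
static int set_and_setr_agree(void)
{
    /* _mm_set_epi32 takes its arguments from highest to lowest element. */
    const __m128i lo = _mm_set_epi32(3, 2, 1, 0);
    const __m128i hi = _mm_set_epi32(7, 6, 5, 4);
    const int32_t expected[8] = {0, 1, 2, 3, 4, 5, 6, 7};
    int32_t a[8], b[8];

    _mm256_storeu_si256((__m256i *)a, my_set_m128i(hi, lo));
    _mm256_storeu_si256((__m256i *)b, my_setr_m128i(lo, hi));
    return memcmp(a, expected, sizeof a) == 0 &&
           memcmp(b, expected, sizeof b) == 0;
}
```

The trade-off is that call sites use a non-standard name, but the code no longer depends on knowing which compiler version introduced which intrinsic.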
