GNU C native vectors: how to broadcast a scalar, like x86's _mm_set1_epi16

灰色年华 2020-12-07 02:20

How do I write a portable version of this using GNU C native vectors, one that doesn't depend on the x86 set1 intrinsic?

typedef uint16_t v8su __attribute__((vector         


        
1 answer
  • 2020-12-07 03:03

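    A generic broadcast that works with both GCC and Clang follows from two observations:

    1. Clang's OpenCL vector extensions and GCC's vector extensions both support mixed scalar-vector arithmetic: the scalar operand is implicitly converted to a vector with that value in every lane.
    2. x - 0 = x for every x, including negative zero. (x + 0 does not work for floats because of signed zero: -0.0 + 0.0 is +0.0.)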
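
    The signed-zero point in observation 2 can be checked with plain scalar floats. The following is a minimal sketch, assuming the default round-to-nearest mode and no -ffast-math; the function names are hypothetical and not part of the original answer.

```c
#include <math.h>

/* x - 0 keeps the sign of a zero input: (-0.0f) - (+0.0f) is -0.0f. */
float sub_zero(float x) {
  return x - 0.0f;
}

/* x + 0 loses it: (-0.0f) + (+0.0f) is +0.0f under round-to-nearest. */
float add_zero(float x) {
  return x + 0.0f;
}

/* Returns 1 if the sign bit of x is set, 0 otherwise. */
int has_sign_bit(float x) {
  return signbit(x) != 0;
}
```

    Passing -0.0f through sub_zero leaves the sign bit set, while add_zero clears it, which is why the broadcast subtracts the zero vector rather than adding it.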

    Here is a solution for a vector of four floats.

    #if defined (__clang__)
    typedef float v4sf __attribute__((ext_vector_type(4)));
    #else
    typedef float v4sf __attribute__ ((vector_size (16)));
    #endif
    
    /* Subtracting a zero vector from the scalar broadcasts x to all four lanes. */
    v4sf broadcast4f(float x) {
      return x - (v4sf){};
    }
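
    To see the result lane by lane, the vector can be indexed like an array, which both GCC and Clang allow for these vector types. The sketch below repeats the definitions so that it compiles on its own; all_lanes_equal is a hypothetical helper added only for illustration.

```c
#if defined(__clang__)
typedef float v4sf __attribute__((ext_vector_type(4)));
#else
typedef float v4sf __attribute__((vector_size(16)));
#endif

/* Broadcast as in the answer: scalar minus a zero vector. */
v4sf broadcast4f(float x) {
  return x - (v4sf){};
}

/* Hypothetical helper: returns 1 if every lane of v equals x, else 0. */
int all_lanes_equal(v4sf v, float x) {
  for (int i = 0; i < 4; i++) {
    if (v[i] != x) {
      return 0;
    }
  }
  return 1;
}
```

    For example, all_lanes_equal(broadcast4f(2.5f), 2.5f) should return 1.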
    

    https://godbolt.org/g/PXr3Xb

    The same generic solution can be used for different vectors. Here is an example for a vector of eight unsigned shorts.

    #if defined (__clang__)
    typedef unsigned short v8su __attribute__((ext_vector_type(8)));
    #else
    typedef unsigned short v8su __attribute__((vector_size(16)));
    #endif
    
    /* Same trick for integers: x is converted to unsigned short and copied to all eight lanes. */
    v8su broadcast8us(short x) {
      return x - (v8su){};
    }
    

    ICC 17 supports a subset of the GCC vector extensions, but it does not yet support vector + scalar or vector * scalar operations, so intrinsics are still necessary for broadcasts there. MSVC does not support any vector extensions.
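
    One way to cope with compilers that lack scalar-vector arithmetic is to take the intrinsic path whenever SSE2 is available and use the generic subtraction otherwise. The following is only a sketch under that assumption and has not been tested on ICC; broadcast8us_compat and all_lanes_equal_us are hypothetical names, and memcpy is used for the reinterpretation so that no vector cast or subscript support is required.

```c
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* _mm_set1_epi16 */
#endif

#if defined(__clang__)
typedef unsigned short v8su __attribute__((ext_vector_type(8)));
#else
typedef unsigned short v8su __attribute__((vector_size(16)));
#endif

/* Hypothetical wrapper: intrinsic on SSE2 targets, generic subtraction elsewhere. */
v8su broadcast8us_compat(unsigned short x) {
#if defined(__SSE2__)
  /* Intrinsic path: needs no scalar-vector arithmetic from the compiler. */
  __m128i tmp = _mm_set1_epi16((short)x);
  v8su result;
  memcpy(&result, &tmp, sizeof result); /* same 16 bytes, reinterpreted */
  return result;
#else
  /* Generic path from the answer above. */
  return x - (v8su){};
#endif
}

/* Hypothetical helper: returns 1 if all eight lanes of v equal x, else 0. */
int all_lanes_equal_us(v8su v, unsigned short x) {
  unsigned short lanes[8];
  memcpy(lanes, &v, sizeof lanes);
  for (int i = 0; i < 8; i++) {
    if (lanes[i] != x) {
      return 0;
    }
  }
  return 1;
}
```

    The trade-off is that the x86 branch depends on the intrinsic again, so this only helps when supporting such a compiler matters more than keeping the source free of intrinsics.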
