Get member of __m128 by index?

旧时难觅i 2020-12-05 10:28

I've got some code, originally given to me by someone working with MSVC, and I'm trying to get it to work on Clang. Here's the function that I'm having trouble with:

4 answers
  • A union is probably the most portable way to do this:

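    #include <assert.h>
    #include <xmmintrin.h>  // SSE: __m128

    // typedef so that U names a type (a bare "union { ... } U;" declares a variable)
    typedef union {
        __m128 v;    // SSE 4 x float vector
        float a[4];  // scalar array of 4 floats
    } U;

    float vectorGetByIndex(__m128 V, unsigned int i)
    {
        U u;

        assert(i <= 3);  // only lanes 0..3 exist
        u.v = V;         // write the vector member ...
        return u.a[i];   // ... and read the same bytes back as floats
    }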
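
    A short usage sketch of the union approach may help with lane ordering. The names `FloatLanes`, `laneAt` and `makeSample` are made up for illustration, and the code assumes a compiler that supports reading the inactive union member (GCC documents this; it is assumed here for Clang and MSVC, and the C++ standard itself does not guarantee it).

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE: __m128, _mm_set_ps

// Hypothetical names; the technique is the union from the answer above.
union FloatLanes {
    __m128 v;    // the SSE vector
    float a[4];  // the same 16 bytes viewed as four floats
};

// Return lane i (0..3) of V. The index may be a runtime value.
float laneAt(__m128 V, unsigned i)
{
    assert(i < 4);
    FloatLanes u;
    u.v = V;
    return u.a[i];
}

// _mm_set_ps takes its arguments from the highest lane down to the lowest,
// so the LAST argument ends up in lane 0.
inline __m128 makeSample()
{
    return _mm_set_ps(40.0f, 30.0f, 20.0f, 10.0f);
}
```

    With this sample vector, `laneAt(makeSample(), 0)` reads the last argument passed to `_mm_set_ps`, and `laneAt(makeSample(), 3)` reads the first.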
    
  • 2020-12-05 10:56

    As a modification to hirschhornsalz's solution, if i is a compile-time constant, you could avoid the union path entirely by using a shuffle/store:

    #include <xmmintrin.h>  // SSE: _mm_shuffle_ps, _mm_cvtss_f32

    template<unsigned i>
    float vectorGetByIndex( __m128 V)
    {
        static_assert(i < 4, "index must be in 0..3");
        // No SSE4.1 branch: _mm_extract_epi32 takes a __m128i and returns an
        // int, so it cannot hand back the float stored in the lane.

        // shuffle V so that the element that you want is moved to the least-
        // significant element of the vector (V[0])
        V = _mm_shuffle_ps(V, V, _MM_SHUFFLE(i, i, i, i));
        // return the value in V[0]
        return _mm_cvtss_f32(V);
    }
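
    The shuffle needs its lane selector at compile time. If the index only becomes known at run time, one option is to dispatch to the four template instantiations through a `switch`. The sketch below is an illustration under that assumption, not part of the original answer; `laneConst` and `laneRuntime` are hypothetical names.

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE: _mm_shuffle_ps, _mm_cvtss_f32, _MM_SHUFFLE

// Compile-time index: broadcast lane i to every lane, then read lane 0.
template <unsigned i>
float laneConst(__m128 V)
{
    static_assert(i < 4, "lane index must be 0..3");
    V = _mm_shuffle_ps(V, V, _MM_SHUFFLE(i, i, i, i));
    return _mm_cvtss_f32(V);
}

// Runtime index: pick the matching instantiation with a switch.
float laneRuntime(__m128 V, unsigned i)
{
    assert(i < 4);
    switch (i) {
        case 0:  return laneConst<0>(V);
        case 1:  return laneConst<1>(V);
        case 2:  return laneConst<2>(V);
        default: return laneConst<3>(V);
    }
}
```

    Whether this beats the union version for runtime indices depends on the compiler and the surrounding code, so it is worth measuring before committing to either.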
    
  • 2020-12-05 10:58

    Use

    template<unsigned i>
    float vectorGetByIndex( __m128 V) {
        union {
            __m128 v;    
            float a[4];  
        } converter;
        converter.v = V;
        return converter.a[i];
    }
    

    which will work regardless of the available instruction set.

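    Note: Even if SSE4.1 is available and i is a compile-time constant, you can't use the pextract family (e.g. _mm_extract_epi32) this way. That intrinsic takes an integer vector (__m128i) and returns a 32-bit int, so returning it from a float function converts the integer numerically instead of giving you the float stored in the lane: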

    // broken code starts here
    template<unsigned i>
    float vectorGetByIndex( __m128 V) {
        return _mm_extract_epi32(V, i);
    }
    // broken code ends here
    

    I'm not deleting it because it is a useful reminder of how not to do things.
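
    To make the integer-versus-float point concrete, here is a small sketch. It is an illustration, not the original poster's code: it uses the SSE2 pair `_mm_castps_si128` / `_mm_cvtsi128_si32` as a stand-in for an integer extraction of lane 0, so that it builds without SSE4.1 enabled, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstring>      // std::memcpy
#include <emmintrin.h>  // SSE2: _mm_castps_si128, _mm_cvtsi128_si32

// Raw 32-bit pattern of lane 0, obtained through an integer extraction.
int lane0Bits(__m128 V)
{
    return _mm_cvtsi128_si32(_mm_castps_si128(V));
}

// WRONG: int -> float is a numeric conversion, so the bit pattern
// 0x3F800000 (the encoding of 1.0f) comes back as 1065353216.0f.
float lane0Wrong(__m128 V)
{
    return static_cast<float>(lane0Bits(V));
}

// Reinterpreting the bits: copy the 4 bytes into a float unchanged.
float lane0Right(__m128 V)
{
    int bits = lane0Bits(V);
    float f;
    static_assert(sizeof f == sizeof bits, "float and int must both be 32 bits");
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

    For a vector whose lane 0 holds 1.0f, `lane0Wrong` yields a value around one billion while `lane0Right` yields 1.0f, which is the failure mode the broken template above runs into.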

  • 2020-12-05 10:59

    The approach I use is:

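    #include <xmmintrin.h>  // SSE: __m128

    // members are separated by semicolons, not commas
    union vec { __m128 sse; float f[4]; };

    float accessmember(__m128 v, int index)
    {
        vec u;       // separate name so the parameter v is not shadowed
        u.sse = v;   // store the vector ...
        return u.f[index];  // ... and read one float back (index must be 0..3)
    }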
    

    Seems to work out pretty well for me.
