I'm playing around with SIMD and wonder why there is no analogue of _mm_cvtsd_f64 to extract the high-order double from a __m128d.
GCC 4.6+ has an extensio
I suggest that you use the following code:
static inline double _mm_cvtsd_f64_h(__m128d x) {
return _mm_cvtsd_f64(_mm_unpackhi_pd(x, x));
}
This is likely the fastest way to get the upper half of an XMM
register, and it is compatible with MSVC/icc/gcc/clang.
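For reference, here is a minimal sketch of how that helper might be used. The names high_lane_unpack and demo_high are made up for illustration and are not part of any intrinsics header. It assumes an x86 target with SSE2 enabled.

```c
#include <emmintrin.h>  /* SSE2: __m128d, _mm_set_pd, _mm_unpackhi_pd, _mm_cvtsd_f64 */

/* Hypothetical helper name: returns the upper double of x.
   _mm_unpackhi_pd(x, x) yields {x[1], x[1]}, so the low lane of the
   result holds the original high lane, which _mm_cvtsd_f64 then reads. */
static inline double high_lane_unpack(__m128d x) {
    return _mm_cvtsd_f64(_mm_unpackhi_pd(x, x));
}

/* Hypothetical demo: builds a vector from two scalars and returns the
   high lane. Note that _mm_set_pd takes the HIGH element first. */
static inline double demo_high(double lo, double hi) {
    __m128d v = _mm_set_pd(hi, lo);
    return high_lane_unpack(v);
}
```

Calling demo_high(1.0, 2.0) should give back the second argument, since that value was placed in the upper lane.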
You can just use a union:
union {
__m128d v;
double a[2];
} U;
Assign your __m128d to U.v and read back U.a[0] (the low half) or U.a[1] (the high half). Any decent compiler will optimise away redundant stores and loads.
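A rough sketch of the union approach wrapped in two accessors. The type name m128d_lanes and the functions lane_low and lane_high are invented for this example. It assumes C99 or later, where reading a union member other than the one last written is permitted. In C++ this kind of type punning is formally undefined, although GCC documents it as supported.

```c
#include <emmintrin.h>  /* SSE2: __m128d, _mm_set_pd */

/* Hypothetical type name: overlays the vector with two doubles. */
typedef union {
    __m128d v;
    double  a[2];
} m128d_lanes;

/* Returns the low lane (element 0) of x. */
static inline double lane_low(__m128d x) {
    m128d_lanes u;
    u.v = x;
    return u.a[0];
}

/* Returns the high lane (element 1) of x. */
static inline double lane_high(__m128d x) {
    m128d_lanes u;
    u.v = x;
    return u.a[1];
}
```

With v = _mm_set_pd(4.0, 3.0), lane_low(v) reads 3.0 and lane_high(v) reads 4.0, because _mm_set_pd takes the high element as its first argument.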