SSE/AVX: Choose from two __m256 float vectors based on per-element min and max absolute value
Question:

I am looking for an efficient AVX (or AVX-512) implementation of the following:

    // Given
    float u[8];
    float v[8];

    // Compute
    float a[8];
    float b[8];

    // Such that
    for ( int i = 0; i < 8; ++i )
    {
        a[i] = fabs(u[i]) >= fabs(v[i]) ? u[i] : v[i];
        b[i] = fabs(u[i]) <  fabs(v[i]) ? u[i] : v[i];
    }

That is, I need to select element-wise into a from u and v based on mask, and into b based on !mask, where mask = (fabs(u) >= fabs(v)) element-wise.

Answer 1:

I had this exact same problem just the other day. The solution I came up with
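
Independently of the answer above, the mask / !mask selection described in the question can be sketched with plain AVX intrinsics. This is only an illustration under stated assumptions, not the answerer's solution: the function name select_by_abs and its pointer-based interface are hypothetical, and the target attribute is there only so the sketch builds with GCC or Clang without extra flags. The absolute value is taken by clearing the sign bit, the comparison yields an all-ones or all-zeros lane mask, and two blends pick from u and v in opposite order.

```c
#include <immintrin.h>

/* Hypothetical helper (name and signature are assumptions, not from the post).
   For each of the 8 lanes:
     a[i] = the input with the larger magnitude (u[i] on ties)
     b[i] = the other input */
#if defined(__GNUC__) || defined(__clang__)
__attribute__((target("avx")))
#endif
void select_by_abs(const float *u, const float *v, float *a, float *b)
{
    /* -0.0f has only the sign bit set in every lane. */
    const __m256 sign_bit = _mm256_set1_ps(-0.0f);

    __m256 vu = _mm256_loadu_ps(u);
    __m256 vv = _mm256_loadu_ps(v);

    /* andnot(x, y) computes (~x) & y, so this clears the sign bit: fabs. */
    __m256 abs_u = _mm256_andnot_ps(sign_bit, vu);
    __m256 abs_v = _mm256_andnot_ps(sign_bit, vv);

    /* mask lane is all ones where |u| >= |v| (ordered, non-signaling). */
    __m256 mask = _mm256_cmp_ps(abs_u, abs_v, _CMP_GE_OQ);

    /* blendv(x, y, m) takes y where the sign bit of m is set, else x. */
    _mm256_storeu_ps(a, _mm256_blendv_ps(vv, vu, mask)); /* mask ? u : v */
    _mm256_storeu_ps(b, _mm256_blendv_ps(vu, vv, mask)); /* mask ? v : u */
}
```

Note that this follows the mask / !mask wording, so a lane containing NaN yields a = v and b = u, whereas the scalar loop (where both comparisons are false for NaN) would put v in both. On hardware with AVX-512VL, the same structure could presumably use _mm256_cmp_ps_mask together with _mm256_mask_blend_ps, which keeps the mask in a k-register instead of a vector.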