Summing 8-bit integers in __m512i with AVX intrinsics
Question

AVX-512 provides intrinsics to sum all the elements of a 512-bit vector (__m512, __m512d or __m512i). However, some of their counterparts are missing: there is no _mm512_reduce_add_epi8 yet.

    _mm512_reduce_add_ps     // horizontal sum of 16 floats
    _mm512_reduce_add_pd     // horizontal sum of 8 doubles
    _mm512_reduce_add_epi32  // horizontal sum of 16 32-bit integers
    _mm512_reduce_add_epi64  // horizontal sum of 8 64-bit integers

Basically, I need to implement MAGIC in the following snippet.

    __m512i all_ones = _mm512_set1_epi16(1);
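For reference, here is a minimal sketch of one way to build the missing byte reduction from intrinsics that do exist. It is not necessarily what the snippet above was heading towards. It assumes AVX512BW is available and uses _mm512_sad_epu8 against an all-zero vector, which adds up each group of 8 unsigned bytes into a 64-bit lane, and then _mm512_reduce_add_epi64 to combine the 8 lanes. For signed bytes it flips the top bit of every byte (mapping -128..127 onto 0..255) and subtracts the resulting bias of 64 * 128 afterwards. The names hsum_epu8, hsum_epi8, sum64_epi8 and sum64_epi8_scalar are made up for this example. The target attribute and __builtin_cpu_supports are GCC/Clang extensions, used here only so the file builds without -mavx512bw and falls back to a scalar loop on CPUs without AVX-512.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar reference and fallback: sum of 64 signed bytes. */
static int64_t sum64_epi8_scalar(const int8_t *p)
{
    int64_t total = 0;
    for (size_t i = 0; i < 64; ++i)
        total += p[i];
    return total;
}

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Hypothetical helper: sum of the 64 bytes of v, treated as unsigned.
 * _mm512_sad_epu8 against zero gives eight 64-bit partial sums (one per
 * group of 8 bytes), which _mm512_reduce_add_epi64 then adds together. */
__attribute__((target("avx512f,avx512bw")))
static uint64_t hsum_epu8(__m512i v)
{
    __m512i partial = _mm512_sad_epu8(v, _mm512_setzero_si512());
    return (uint64_t)_mm512_reduce_add_epi64(partial);
}

/* Hypothetical helper: sum of the 64 bytes of v, treated as signed.
 * XOR with 0x80 adds 128 to every signed byte, so the unsigned sum is
 * too large by 64 * 128, which is subtracted at the end. */
__attribute__((target("avx512f,avx512bw")))
static int64_t hsum_epi8(__m512i v)
{
    __m512i biased = _mm512_xor_si512(v, _mm512_set1_epi8((char)0x80));
    return (int64_t)hsum_epu8(biased) - 64 * 128;
}

/* Loads 64 bytes (unaligned is fine) and reduces them. */
__attribute__((target("avx512f,avx512bw")))
static int64_t sum64_epi8_avx512(const int8_t *p)
{
    return hsum_epi8(_mm512_loadu_si512((const void *)p));
}
#endif

/* Runtime dispatch: AVX-512 path if the CPU supports it, scalar otherwise. */
int64_t sum64_epi8(const int8_t *p)
{
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx512f") && __builtin_cpu_supports("avx512bw"))
        return sum64_epi8_avx512(p);
#endif
    return sum64_epi8_scalar(p);
}
```

If the whole translation unit is already compiled with -mavx512bw, the attributes and the dispatch wrapper are unnecessary and hsum_epi8 can be called directly on a __m512i. The result is returned as a 64-bit integer because the sum of 64 bytes does not fit in 8 bits: it ranges from -8192 to 8128 for signed input.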