Comparing 2 vectors in AVX/AVX2 (c)

筅森魡賤 submitted on 2021-01-20 07:11:50

Question


I have two __m256i vectors (each containing chars), and I want to find out whether they are completely identical. All I need is true if all bits are equal, and false otherwise.

What's the most efficient way of doing that? Here's the code loading the arrays:

const char *a1 = "abcdefhgabcdefhgabcdefhgabcdefhg";
// loadu: a string literal is not guaranteed to be 32-byte aligned
__m256i r1 = _mm256_loadu_si256((const __m256i *) a1);

const char *a2 = "abcdefhgabcdefhgabcdefhgabcdefhg";
__m256i r2 = _mm256_loadu_si256((const __m256i *) a2);

Answer 1:


The most efficient way on current Intel and AMD CPUs is an element-wise comparison for equality, and then check that the comparison was true for all elements.

This compiles to multiple instructions, but they're all cheap and (if you branch on the result) the compare+branch even macro-fuses into a single uop.

#include <immintrin.h>
#include <stdbool.h>

bool vec_equal(__m256i a, __m256i b) {
    __m256i pcmp = _mm256_cmpeq_epi32(a, b);  // epi8 is fine too
    unsigned bitmask = _mm256_movemask_epi8(pcmp);
    return (bitmask == 0xffffffffU);
}

The resulting asm should be vpcmpeqd / vpmovmskb / cmp 0xffffffff / je, which is only 3 uops on Intel CPUs.
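To make the compare / movemask idiom easy to try on raw byte buffers, here is a minimal sketch. The helper name `bytes32_equal` is hypothetical and not part of the answer above. It takes pointers and uses unaligned loads, since plain char data is not guaranteed to be 32-byte aligned. The `target("avx2")` attribute is a GCC/Clang extension assumed here so the file builds without `-mavx2`; it can be dropped when compiling with that flag.

```c
#include <immintrin.h>
#include <stdbool.h>

/* Hypothetical helper (an assumption, not from the answer):
   returns true when the 32 bytes at p and q are identical.
   The target attribute is a GCC/Clang extension. */
__attribute__((target("avx2")))
bool bytes32_equal(const void *p, const void *q) {
    /* Unaligned loads, so the buffers need no special alignment. */
    __m256i a = _mm256_loadu_si256((const __m256i *)p);
    __m256i b = _mm256_loadu_si256((const __m256i *)q);

    /* Each byte of pcmp is 0xFF where a and b match, 0x00 otherwise. */
    __m256i pcmp = _mm256_cmpeq_epi8(a, b);

    /* Collect the top bit of each of the 32 bytes into a 32-bit mask. */
    unsigned bitmask = (unsigned)_mm256_movemask_epi8(pcmp);

    /* All 32 bits set means every byte compared equal. */
    return bitmask == 0xffffffffU;
}
```

Calling `bytes32_equal(a1, a2)` on the two 32-character strings from the question should report them as equal, and changing any one character in either string should make it return false. The caller is responsible for making sure both pointers refer to at least 32 readable bytes.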

vptest is 2 uops and doesn't macro-fuse with jcc, so it is equal to or worse than movmsk / cmp for testing the result of a packed compare. (See http://agner.org/optimize/)
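For comparison, the vptest-based alternative mentioned above could look roughly like the following sketch. The helper name `bytes32_equal_ptest` is hypothetical, and XOR followed by `_mm256_testz_si256` is one common way to write this check, not necessarily what a compiler would emit for other formulations. As with any such sketch, the `target("avx2")` attribute is a GCC/Clang extension assumed only so that the file builds without `-mavx2`.

```c
#include <immintrin.h>
#include <stdbool.h>

/* Hypothetical helper (an assumption, not from the answer):
   same result as the movemask version, but tests with vptest. */
__attribute__((target("avx2")))
bool bytes32_equal_ptest(const void *p, const void *q) {
    __m256i a = _mm256_loadu_si256((const __m256i *)p);
    __m256i b = _mm256_loadu_si256((const __m256i *)q);

    /* diff is all-zero exactly when a and b are bitwise identical. */
    __m256i diff = _mm256_xor_si256(a, b);

    /* _mm256_testz_si256 returns 1 when (diff & diff) has no bits set. */
    return _mm256_testz_si256(diff, diff) != 0;
}
```

Both versions return the same answer; the point made above is only about cost, since the vptest result does not macro-fuse with the following branch.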



Source: https://stackoverflow.com/questions/47243456/comparing-2-vectors-in-avx-avx2-c
