Faster approach to checking for an all-zero buffer in C?

孤独总比滥情好 2020-12-03 05:33

I am searching for a faster method of accomplishing this:

int is_empty(char * buf, int size) 
{
    int i;
    for(i = 0; i < size; i++) {
        if(buf[i] != 0) return 0;
    }
    return 1;
}

20 answers
  • 2020-12-03 05:48

    The benchmarks given above (https://stackoverflow.com/a/1494499/2154139) are not accurate. They imply that func3 is much faster than the other options.

    However, if you change the order of the tests so that func3 comes before func2, you'll see that func2 is much faster.

    Be careful when running several benchmarks within a single execution: the side effects are large, especially when the same variables are reused. It is better to run each test in isolation.

    For example, changing it to:

    int main(){
      MEASURE( func3 );
      MEASURE( func3 );
      MEASURE( func3 );
      MEASURE( func3 );
      MEASURE( func3 );
    }
    

    gives me:

    func3: zero          14243
    func3: zero           1142
    func3: zero            885
    func3: zero            848
    func3: zero            870
    

    This was really bugging me as I couldn't see how func3 could perform so much faster than func2.
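
One way to reduce this first-run effect is to do an untimed warm-up call before measuring. The following is only a sketch of that idea, not the MEASURE macro from the linked answer: `is_empty_loop`, `check_fn`, and `time_check` are hypothetical names, and `clock()` is used simply because it is in the standard library.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical candidate under test: the plain byte loop from the question. */
int is_empty_loop(const char *buf, size_t size)
{
    for (size_t i = 0; i < size; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

typedef int (*check_fn)(const char *, size_t);

/* Time `reps` calls of fn after one untimed warm-up call, so the cost of
   first touching the buffer is not charged to whichever test runs first.
   Returns elapsed CPU time in seconds. */
double time_check(check_fn fn, const char *buf, size_t size, int reps)
{
    volatile int sink = fn(buf, size);   /* warm-up, not timed */
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        sink = fn(buf, size);            /* volatile store keeps the result live */
    clock_t end = clock();
    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Even with a warm-up, running each candidate in its own process is the safer comparison, as described above.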

    (Apologies for posting this as an answer rather than a comment; I didn't have enough reputation.)

  • 2020-12-03 05:48

    Edit: Bad answer

    A novel approach might be

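    #include <string.h>

    /* Kept for reference only: strlen() stops at the first zero byte,
       so this does not actually detect an all-zero buffer. */
    int is_empty(char * buf, int size) {
        char start = buf[0];
        char end = buf[size-1];
        buf[0] = 'x';
        buf[size-1] = '\0';
        int result = strlen(buf) == 0;
        buf[0] = start;
        buf[size-1] = end;
        return result;
    }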
    

    Why the craziness? Because strlen is one of the library functions most likely to be optimized. Storing and replacing the first character is meant to prevent a false positive. Storing and replacing the last character is meant to make sure the scan terminates.

  • 2020-12-03 05:50

    Look at fast memcpy - it can be adapted for memcmp (or memcmp against a constant value).
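
As a rough illustration of the "memcmp against a constant value" idea, the sketch below compares the buffer chunk by chunk against a static block of zeros and leaves the speed to the library's memcmp. The name `is_empty_memcmp` and the 4096-byte chunk size are assumptions made for this example, not something from the answer.

```c
#include <stddef.h>
#include <string.h>

/* Compare the buffer against a constant zero block, one chunk at a time. */
int is_empty_memcmp(const char *buf, size_t size)
{
    static const char zeros[4096];   /* static storage is zero-initialized */

    while (size >= sizeof zeros) {
        if (memcmp(buf, zeros, sizeof zeros) != 0)
            return 0;
        buf  += sizeof zeros;
        size -= sizeof zeros;
    }
    /* Remaining tail, shorter than one chunk (possibly zero bytes). */
    return memcmp(buf, zeros, size) == 0;
}
```

Whether this beats a hand-written loop depends on the C library and the buffer size, so it is worth measuring on the target platform.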

  • 2020-12-03 05:53

    With x86 you can use SSE to test 16 bytes at a time:

    #include <stddef.h>    // size_t
    #include <smmintrin.h> // note: requires SSE 4.1
    
    int is_empty(const char *buf, const size_t size) 
    {
        size_t i;
        for (i = 0; i + 16 <= size; i += 16)
        {
            __m128i v = _mm_loadu_si128((const __m128i *)&buf[i]);
            if (!_mm_testz_si128(v, v))
                return 0;
        }
        for ( ; i < size; ++i)
        {
            if (buf[i] != 0)
                return 0;
        }
        return 1;
    }
    

    This can probably be further improved with loop unrolling.

    On modern x86 CPUs with AVX you can even use 256-bit SIMD and test 32 bytes at a time.
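
A possible shape for the unrolled loop is sketched below. Note the swap in technique: instead of the SSE 4.1 `_mm_testz_si128`, it uses only SSE2 (`_mm_cmpeq_epi8` plus `_mm_movemask_epi8`), ORs two 16-byte loads together, and so checks 32 bytes per iteration. The function name is hypothetical, and the `__SSE2__` guard is an assumption about GCC/Clang-style compilers; where the macro is not defined, only the scalar loop is used.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

int is_empty_sse2_unrolled(const char *buf, size_t size)
{
    size_t i = 0;
#if defined(__SSE2__)
    const __m128i zero = _mm_setzero_si128();
    /* Unrolled by two: OR two 16-byte vectors, then test the result once. */
    for (; i + 32 <= size; i += 32) {
        __m128i a = _mm_loadu_si128((const __m128i *)(buf + i));
        __m128i b = _mm_loadu_si128((const __m128i *)(buf + i + 16));
        __m128i both = _mm_or_si128(a, b);
        /* The mask is 0xFFFF only if all 16 bytes of `both` equal zero. */
        if (_mm_movemask_epi8(_mm_cmpeq_epi8(both, zero)) != 0xFFFF)
            return 0;
    }
#endif
    /* Scalar tail, and the whole buffer when SSE2 is not available. */
    for (; i < size; ++i) {
        if (buf[i] != 0)
            return 0;
    }
    return 1;
}
```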

  • 2020-12-03 05:53

    Did anyone mention unrolling the loop? In any of these loops, the loop overhead and indexing are going to be significant.

    Also, what is the probability that the buffer will actually be empty? That's the only case where you have to check all of it. If there typically is some garbage in the buffer, the loop should stop very early, so it doesn't matter.

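    If you plan to clear the buffer when it is not all zero, it would probably be faster just to clear it unconditionally with memset(buf, 0, size), whether or not it is already zero. (Pass the byte count: sizeof(buf) on a pointer parameter gives the size of the pointer, not of the buffer.)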
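
A portable sketch of the unrolling idea, without intrinsics: read four 64-bit words per iteration, OR them together, and branch once. The name `is_empty_words` and the unroll factor of four are assumptions for illustration; `memcpy` is used for the loads so that unaligned buffers and strict-aliasing rules are not a problem.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

int is_empty_words(const char *buf, size_t size)
{
    size_t i = 0;

    /* Unrolled: 4 x 8 bytes per iteration, one branch per 32 bytes. */
    for (; i + 32 <= size; i += 32) {
        uint64_t w[4];
        memcpy(w, buf + i, sizeof w);   /* safe for any alignment */
        if ((w[0] | w[1] | w[2] | w[3]) != 0)
            return 0;
    }
    /* Byte-wise tail for the last 0..31 bytes. */
    for (; i < size; ++i) {
        if (buf[i] != 0)
            return 0;
    }
    return 1;
}
```

The early return still applies, so a buffer with garbage near the start exits after the first block.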

  • 2020-12-03 05:54
    int is_empty(char * buf, int size)
    {
       return buf[0] == '\0';
    }
    

    If your buffer is not a character string, I think that's the fastest way to check...

    memcmp() would require you to create a buffer of the same size and then use memset to set it all to 0. I doubt that would be faster...
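
For what it's worth, memcmp can be used without a second buffer by comparing the buffer with itself shifted by one byte: if the first byte is zero and every byte equals its neighbour, all bytes are zero. This is a different technique from the one in the answer, shown only as a sketch; `is_empty_overlap` is a hypothetical name, and whether it is actually faster would need to be measured.

```c
#include <stddef.h>
#include <string.h>

/* All-zero test with no scratch buffer: buf[0] == 0 and buf[i] == buf[i+1]
   for every i implies that every byte is zero. */
int is_empty_overlap(const char *buf, size_t size)
{
    if (size == 0)
        return 1;
    /* memcmp only reads its arguments, so the overlapping ranges are fine. */
    return buf[0] == 0 && memcmp(buf, buf + 1, size - 1) == 0;
}
```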
