Faster approach to checking for an all-zero buffer in C?

孤独总比滥情好 2020-12-03 05:33

I am searching for a faster method of accomplishing this:

int is_empty(char * buf, int size) 
{
    int i;
    for(i = 0; i < size; i++) {
        if(buf[i] != 0) return 0;
    }
    return 1;
}

20 answers
  • 2020-12-03 05:38

    I think I have a good solution for this. Create a dummy zeroed array and use memcmp(). That's what I do.
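
    A minimal sketch of that idea, assuming the dummy array is a fixed-size static block and the buffer is compared against it one chunk at a time. The name is_empty_chunked and the chunk size ZERO_CHUNK are made up for illustration, and the sketch has not been benchmarked:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical chunk size; any positive value works. */
#define ZERO_CHUNK 4096

/* Returns 1 if all `size` bytes of `buf` are zero, 0 otherwise. */
int is_empty_chunked(const char *buf, size_t size)
{
    /* The "dummy" zeroed array, shared by every call. */
    static const char zero[ZERO_CHUNK] = { 0 };

    while (size > 0) {
        /* Compare at most one chunk per memcmp() call. */
        size_t n = size < ZERO_CHUNK ? size : ZERO_CHUNK;
        if (memcmp(buf, zero, n) != 0)
            return 0;
        buf += n;
        size -= n;
    }
    return 1;
}
```

    Comparing in chunks keeps the zeroed array small while still covering buffers of any length.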

  • 2020-12-03 05:43

    You stated in your question that you are looking for a most likely unnecessary micro-optimization. In 'normal' cases the ASM approach by Thomas and others should give you the fastest results.

    Still, this misses the big picture. If your buffer is really large, then starting at the beginning and essentially doing a linear search is definitely not the fastest way to do this. Assume your cp replacement is quite good at finding large consecutive empty regions, but the array has a few non-empty bytes at the end. Every linear search would have to read the whole array. A quicksort-inspired algorithm, on the other hand, could search for any non-zero elements and abort much faster for a large enough dataset.

    So before doing any kind of micro-optimization, I would look closely at the data in your buffer and see whether it shows any patterns. For a single '1' randomly distributed in the buffer, a linear search (disregarding threading/parallelization) will be the fastest approach; in other cases, not necessarily so.

  • 2020-12-03 05:44

    What about looping from size to zero (cheaper checks):

    int is_empty(char * buf, int size) 
    {
        while(size --> 0) {
            if(buf[size] != 0) return 0;
        }
        return 1;
    }
    

    It must be noted that we probably cannot outperform the compiler, so enable the most aggressive speed optimization in your compiler and assume that you are unlikely to go any faster.

    Or handle everything using pointers (not tested, but likely to perform quite well):

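    int is_empty(char* buf, int size)
    {
        char* last;

        if (size <= 0)
            return 1;

        last = buf + size - 1;
        if (*last != 0)          /* the last byte is checked directly */
            return 0;

        *last = 1;               /* sentinel: guarantees the scan stops */
        while (!*buf)
            buf++;
        *last = 0;               /* restore the original byte */

        return buf == last;      /* did the scan stop only at the sentinel? */
    }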
    
  • 2020-12-03 05:44

    Inline assembly version of the initial C code. There is no error checking: if uiSize == 0 and/or the array is not allocated, exceptions will be generated. Perhaps use try {} catch(), as this might be faster than adding a lot of checks to the code. Or do as I do and try not to call functions with invalid values (usually does not work). At least add a NULL pointer check and a size != 0 check; that is very easy.

     unsigned int IsEmpty(char* pchBuffer, unsigned int uiSize)
     {
        asm {
          push edi
          push ecx

          mov edi, [pchBuffer] // scasb scans through EDI, not ESI
          mov ecx, [uiSize]

          // add NULL ptr and size check here

          xor eax, eax         // al = 0 is the byte scasb compares against; xor also sets ZF

          repe scasb           // compare al with BYTE PTR es:[EDI], then EDI += 1 and ECX -= 1;
                               // repeats while the byte equals al (ZF = 1) and ECX != 0
          jne char_not_zero    // ZF = 0: the scan stopped on a nonzero byte
                               // ZF = 1: fall through, every byte was zero

        all_chars_zero:
          mov eax, 1           // Set return value (works in MASM)
          jmp end

        char_not_zero:
          mov eax, 0           // Still not sure if this works in inline asm

        end:
          pop ecx
          pop edi
      }
    }
    

    That was written on the fly, but it looks correct enough; corrections are welcome. And if someone knows how to set the return value from inline asm, please do tell.

  • 2020-12-03 05:46

    Four functions for testing zeroness of a buffer with simple benchmarking:

    #include <stdio.h>
    #include <string.h>
    #include <wchar.h>
    #include <inttypes.h>

    #define SIZE (8*1024)
    char zero[SIZE] __attribute__(( aligned(8) ));

    /* Read the time-stamp counter. "=A" means the edx:eax pair only on 32-bit x86. */
    #define RDTSC(var)  __asm__ __volatile__ ( "rdtsc" : "=A" (var));

    /* Time one call of func on the zeroed buffer and print the tick count. */
    #define MEASURE( func ) { \
      uint64_t start, stop; \
      RDTSC( start ); \
      int ret = func( zero, SIZE ); \
      RDTSC( stop ); \
      printf( #func ": %s   %12"PRIu64"\n", ret?"non zero": "zero", stop-start ); \
    }
    
    
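    /* Byte-by-byte scan; returns 1 at the first nonzero byte. */
    int func1( char *buff, size_t size ){
      while(size--) if(*buff++) return 1;
      return 0;
    }

    /* First byte is zero and every byte equals its successor => all bytes are zero. */
    int func2( char *buff, size_t size ){
      return *buff || memcmp(buff, buff+1, size-1);
    }

    /* Same idea in 8-byte steps; assumes size >= 8 and an 8-byte-aligned buffer. */
    int func3( char *buff, size_t size ){
      return *(uint64_t*)buff || memcmp(buff, buff+sizeof(uint64_t), size-sizeof(uint64_t));
    }

    /* Same idea in wchar_t steps; assumes size is a multiple of sizeof(wchar_t). */
    int func4( char *buff, size_t size ){
      return *(wchar_t*)buff || wmemcmp((wchar_t*)buff, (wchar_t*)buff+1, size/sizeof(wchar_t)-1);
    }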
    
    int main(){
      MEASURE( func1 );
      MEASURE( func2 );
      MEASURE( func3 );
      MEASURE( func4 );
    }
    

    Result on my old PC:

    func1: zero         108668
    func2: zero          38680
    func3: zero           8504
    func4: zero          24768
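
    func3 reads the buffer through a uint64_t pointer, which depends on the buffer being suitably aligned. As a rough sketch of a variant that avoids the cast, the function below copies eight bytes at a time into a local word with memcpy() and then checks the leftover bytes one by one. The name func_words is made up for illustration, it follows the same return convention as the functions above (1 means non zero), and it was not part of the measurements:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if any byte is nonzero, 0 if the whole buffer is zero. */
int func_words(const char *buff, size_t size)
{
    uint64_t word;

    /* Full 8-byte words: memcpy() avoids alignment and aliasing problems. */
    while (size >= sizeof word) {
        memcpy(&word, buff, sizeof word);
        if (word != 0)
            return 1;
        buff += sizeof word;
        size -= sizeof word;
    }

    /* Remaining 0..7 bytes are checked individually. */
    while (size--) {
        if (*buff++)
            return 1;
    }
    return 0;
}
```

    Whether this beats the memcmp() variants depends on the compiler and the C library, so it would have to be timed with the same MEASURE macro.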
  • 2020-12-03 05:48

    One potential way, inspired by Kieveli's dismissed idea:

    int is_empty(char *buf, size_t size)
    {
        static const char zero[999] = { 0 };
        return !memcmp(zero, buf, size > 999 ? 999 : size);
    }
    

    Note that you can't make this solution work for arbitrary sizes. You could do this:

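    int is_empty(char *buf, size_t size)
    {
        /* needs <stdlib.h>; calloc() takes (count, element size) */
        char *zero = calloc(size ? size : 1, 1);
        int i;

        if (zero == NULL)
            return -1;          /* allocation failed */
        i = memcmp(zero, buf, size);
        free(zero);
        return i == 0;          /* memcmp() returns 0 when the buffers match */
    }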
    

    But any dynamic memory allocation is going to be slower than what you have. The only reason the first solution is faster is that it can use memcmp(), which is going to be hand-optimized in assembly language by the library writers and will be much faster than anything you could code in C.

    EDIT: An optimization no one else has mentioned, based on earlier observations about the "likeliness" of the buffer being in state X: if a buffer isn't empty, is it more likely to be non-empty at the beginning or at the end? If it's more likely to have cruft at the end, you could start your check at the end and probably see a nice little performance boost.
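
    A small sketch of checking from the end, assuming the size is a size_t as in the functions above. The name is_empty_from_end is made up for illustration. Because size_t is unsigned, the loop tests the count before decrementing it instead of counting an index down to zero:

```c
#include <stddef.h>

/* Returns 1 if all bytes are zero, scanning from the last byte down to the first. */
int is_empty_from_end(const char *buf, size_t size)
{
    /* A loop such as `for (i = size - 1; i >= 0; i--)` never terminates
       when i is a size_t, so test first and decrement inside the loop. */
    while (size > 0) {
        if (buf[--size] != 0)
            return 0;
    }
    return 1;
}
```

    With cruft in the last few bytes, this returns after reading only those bytes, while a forward scan reads the whole buffer first.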

    EDIT 2: Thanks to Accipitridae in the comments:

    int is_empty(char *buf, size_t size)
    {
        return buf[0] == 0 && !memcmp(buf, buf + 1, size - 1);
    }
    

    This basically compares the buffer to itself, with an initial check to see if the first element is zero. If the first byte is zero and every byte equals the one after it, all bytes must be zero, so any non-zero element will cause memcmp() to fail. I don't know how this would compare to using another version, but I do know that it will fail quickly (before we even loop) if the first element is nonzero. If you're more likely to have cruft at the end, change buf[0] to buf[size - 1] to get the same effect.
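
    One detail the one-liner leaves open is size == 0, where buf[0] is read and size - 1 wraps around. A guarded sketch of the same self-comparison, with the hypothetical name is_empty_selfcmp:

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if all `size` bytes of `buf` are zero, 0 otherwise. */
int is_empty_selfcmp(const char *buf, size_t size)
{
    /* An empty buffer has no nonzero bytes; this also avoids reading
       buf[0] and the wrap-around of size - 1. */
    if (size == 0)
        return 1;

    /* buf[0] == 0 and buf[i] == buf[i + 1] for every i means all bytes are zero. */
    return buf[0] == 0 && memcmp(buf, buf + 1, size - 1) == 0;
}
```

    The two regions passed to memcmp() overlap, which is fine here because memcmp() only reads from them.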
