Faster approach to checking for an all-zero buffer in C?


I am searching for a faster method of accomplishing this:

int is_empty(char * buf, int size) 
{
    int i;
    for(i = 0; i < size; i++) {
        if(buf[


                      


      
      
        
20 Answers

        
                         				            
            
           
            
                              
                
              
              
                
                  清歌不尽        
                
              
                            
                2020-12-03 05:48
              
            
            
                                                                       
The benchmarks given above (https://stackoverflow.com/a/1494499/2154139) are not accurate. They imply that func3 is much faster than the other options.

However, if you change the order of the tests so that func3 runs before func2, you will see that func2 is much faster.

Be careful when running several benchmarks within a single execution: the side effects are large, especially when the tests reuse the same variables. It is better to run each test in isolation!

For example, changing it to:

int main(){
  MEASURE( func3 );
  MEASURE( func3 );
  MEASURE( func3 );
  MEASURE( func3 );
  MEASURE( func3 );
}


gives me:

func3: zero          14243
func3: zero           1142
func3: zero            885
func3: zero            848
func3: zero            870


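The same function is more than ten times slower on its first run than on the later ones, so its position in the test order matters more than the function itself. This was really bugging me, as I couldn't see how func3 could perform so much faster than func2.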
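One way to reduce this ordering effect is to give each function a few untimed warm-up calls before timing it, and to run each candidate in its own process. The following is only a minimal sketch of that idea: `measure`, `check_fn` and `is_empty_loop` are hypothetical names, not the MEASURE macro from the linked answer, and `runs` is assumed to be greater than zero.

```c
#include <stddef.h>
#include <time.h>

typedef int (*check_fn)(const char *buf, size_t size);

/* Hypothetical stand-in for one of the functions under test. */
static int is_empty_loop(const char *buf, size_t size)
{
    for (size_t i = 0; i < size; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/*
 * Calls fn `warmup` times without timing, then `runs` times with timing.
 * Returns the mean CPU time per timed call in seconds and stores the
 * last return value of fn in *last_result. Assumes runs > 0.
 */
static double measure(check_fn fn, const char *buf, size_t size,
                      int warmup, int runs, int *last_result)
{
    /* Volatile pointer: each call must really happen, it cannot be folded away. */
    check_fn volatile call = fn;
    int result = 0;

    for (int i = 0; i < warmup; i++)      /* untimed: warms caches, faults in pages */
        result = call(buf, size);

    clock_t start = clock();
    for (int i = 0; i < runs; i++)        /* timed runs */
        result = call(buf, size);
    clock_t end = clock();

    *last_result = result;
    return (double)(end - start) / CLOCKS_PER_SEC / runs;
}
```

A call such as `measure(is_empty_loop, buf, sizeof buf, 3, 100, &r)` then reports a per-call average that no longer includes the slow first run.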

(Apologies for posting this as an answer rather than a comment; I don't have enough reputation.)
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  情话喂你        
                
              
                            
                2020-12-03 05:48
              
            
            
                                                                       
Edit: Bad answer

A novel approach might be

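#include <string.h>

int is_empty(char * buf, int size) {
    char start = buf[0];
    char end = buf[size-1];
    buf[0] = 'x';          /* force a non-zero first byte */
    buf[size-1] = '\0';    /* guarantee that strlen terminates */
    /* Broken: strlen stops at the first zero byte, so for size > 1 it is never 0 here. */
    int result = strlen(buf) == 0;
    buf[0] = start;
    buf[size-1] = end;
    return result;
}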


Why the craziness? Because strlen is one of the library functions most likely to be heavily optimized.
Storing and replacing the first character is meant to prevent a false positive. Storing and replacing the last character makes sure that strlen terminates.
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  渐次进展        
                
              
                            
                2020-12-03 05:50
              
            
            
                                                                       
Look at how a fast memcpy is implemented; the same techniques can be adapted for memcmp (or for comparing against a constant value).
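A common technique in fast memcpy implementations is to move data one machine word at a time instead of one byte at a time. Below is a rough sketch of how that idea might carry over to a zero check. The name `is_empty_words` is made up for illustration, and the 8-byte word size is an assumption rather than something this answer specifies.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if all `size` bytes of buf are zero, 0 otherwise. */
static int is_empty_words(const char *buf, size_t size)
{
    size_t i = 0;

    /* Main loop: compare 8 bytes per iteration. */
    for (; i + sizeof(uint64_t) <= size; i += sizeof(uint64_t)) {
        uint64_t word;
        /* memcpy is the portable way to do an unaligned load;
           compilers typically turn it into a single load instruction. */
        memcpy(&word, buf + i, sizeof word);
        if (word != 0)
            return 0;
    }

    /* Tail: the remaining 0 to 7 bytes, one at a time. */
    for (; i < size; i++)
        if (buf[i] != 0)
            return 0;

    return 1;
}
```

Whether this beats the plain byte loop depends on the compiler, which may already vectorize the simple version.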
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  一生所求        
                
              
                            
                2020-12-03 05:53
              
            
            
                                                                       
With x86 you can use SSE to test 16 bytes at a time:

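#include <stddef.h>     // size_t
#include <smmintrin.h>  // note: requires SSE 4.1 (e.g. -msse4.1 with GCC/Clang)

int is_empty(const char *buf, const size_t size) 
{
    size_t i;
    // Test 16 bytes per iteration; the unaligned load accepts any address.
    for (i = 0; i + 16 <= size; i += 16)
    {
        __m128i v = _mm_loadu_si128((const __m128i *)&buf[i]);
        if (!_mm_testz_si128(v, v))  // testz(v, v) is 1 only when v is all zero
            return 0;
    }
    // Check the remaining 0 to 15 bytes one at a time.
    for ( ; i < size; ++i)
    {
        if (buf[i] != 0)
            return 0;
    }
    return 1;
}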


This can probably be further improved with loop unrolling.

On modern x86 CPUs with AVX you can even use 256-bit SIMD and test 32 bytes at a time.
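A possible 256-bit variant is sketched below. It relies on assumptions that are not part of the answer above: it uses the GCC/Clang `target` attribute and `__builtin_cpu_supports` so that it builds without `-mavx` and falls back to a byte loop on CPUs without AVX. The names `is_empty_avx`, `is_empty_scalar` and `is_empty_dispatch` are made up for illustration.

```c
#include <stddef.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_AVX_PATH 1

/* GCC/Clang extension: enable AVX for this one function only. */
__attribute__((target("avx")))
static int is_empty_avx(const char *buf, size_t size)
{
    size_t i = 0;
    /* Test 32 bytes per iteration with an unaligned 256-bit load. */
    for (; i + 32 <= size; i += 32) {
        __m256i v = _mm256_loadu_si256((const __m256i *)(buf + i));
        if (!_mm256_testz_si256(v, v))   /* 1 only when v is all zero */
            return 0;
    }
    /* Remaining 0 to 31 bytes. */
    for (; i < size; ++i)
        if (buf[i] != 0)
            return 0;
    return 1;
}
#else
#define HAVE_AVX_PATH 0
#endif

/* Portable fallback: plain byte loop. */
static int is_empty_scalar(const char *buf, size_t size)
{
    for (size_t i = 0; i < size; ++i)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/* Picks the AVX path at run time when the CPU supports it. */
static int is_empty_dispatch(const char *buf, size_t size)
{
#if HAVE_AVX_PATH
    if (__builtin_cpu_supports("avx"))
        return is_empty_avx(buf, size);
#endif
    return is_empty_scalar(buf, size);
}
```

If the whole program is already built with AVX enabled, the attribute and the run-time check are unnecessary and `is_empty_avx` can be called directly.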
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  太阳男子        
                
              
                            
                2020-12-03 05:53
              
            
            
                                                                       
Did anyone mention unrolling the loop? In any of these loops, the loop overhead and indexing are going to be significant.
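As a rough illustration of what unrolling could look like here, the sketch below checks four bytes per iteration and ORs them together so that each group costs a single branch. The function name `is_empty_unrolled` and the unroll factor of four are arbitrary choices for the example, and an optimizing compiler may already do something similar on its own.

```c
#include <stddef.h>

/* Returns 1 if all `size` bytes of buf are zero, 0 otherwise. */
static int is_empty_unrolled(const char *buf, size_t size)
{
    size_t i = 0;

    /* Unrolled main loop: four bytes per iteration, one branch per group.
       The OR of the four bytes is non-zero if any of them is non-zero. */
    for (; i + 4 <= size; i += 4) {
        if (buf[i] | buf[i + 1] | buf[i + 2] | buf[i + 3])
            return 0;
    }

    /* Remaining 0 to 3 bytes. */
    for (; i < size; i++)
        if (buf[i] != 0)
            return 0;

    return 1;
}
```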

Also, what is the probability that the buffer will actually be empty? That's the only case where you have to check all of it.
If there typically is some garbage in the buffer, the loop should stop very early, so it doesn't matter.

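If you plan to clear it to zero if it's not zero, it would probably be faster just to clear it with memset(buf, 0, size), whether or not it's already zero. (sizeof(buf) gives the buffer length only when buf is a real array; for a pointer it gives the size of the pointer.)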
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  野趣味        
                
              
                            
                2020-12-03 05:54
              
            
            
                                                                       
int is_empty(char * buf, int size)
{
   return buf[0] == '\0';
}


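This is sufficient only when the buffer holds a C string, where a zero first byte means an empty string. If your buffer is not a character string, I think the loop you already have is the fastest way to check...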

memcmp() would require you to create a second buffer of the same size and fill it with zeros using memset. I doubt that would be faster...
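For comparison, memcmp can also be used without a second buffer by comparing the buffer against itself shifted by one byte. This is a different technique from the one in this answer, shown only as a sketch; the name `is_empty_memcmp` is made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 if all `size` bytes of buf are zero, 0 otherwise.
 * If buf[0] is zero and every byte equals the byte after it,
 * then every byte in the buffer is zero.
 */
static int is_empty_memcmp(const char *buf, size_t size)
{
    if (size == 0)
        return 1;   /* an empty range has no non-zero byte */
    /* memcmp only reads, so the overlapping ranges are not a problem. */
    return buf[0] == 0 && memcmp(buf, buf + 1, size - 1) == 0;
}
```

Whether this is actually faster than a hand-written loop depends on the memcmp implementation and would need to be measured.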
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
   
          