I am searching for a faster method of accomplishing this:
int is_empty(char * buf, int size)
{
int i;
for(i = 0; i < size; i++) {
if(buf[
The benchmarks given above (https://stackoverflow.com/a/1494499/2154139) are not accurate. They imply that func3 is much faster than the other options.
However, if you change the order of the tests, so that func3 comes before func2, you'd see func2 is much faster.
Be careful when running several benchmarks within a single execution: the side effects are large, especially when the same variables are reused. It is better to run each test in isolation!
For example, changing it to:
int main(){
MEASURE( func3 );
MEASURE( func3 );
MEASURE( func3 );
MEASURE( func3 );
MEASURE( func3 );
}
gives me:
func3: zero 14243
func3: zero 1142
func3: zero 885
func3: zero 848
func3: zero 870
This was really bugging me as I couldn't see how func3 could perform so much faster than func2.
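One plausible reading of the numbers above is that the first call pays one-time costs that later calls do not. Here is a minimal sketch of a harness that makes one untimed warm-up call and then reports the fastest of several timed runs. The names `check_fn`, `all_zero_loop` and `best_time` are made up for illustration; they are not the `MEASURE` macro or the `func2`/`func3` functions from the linked answer.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical signature for the function being measured. */
typedef int (*check_fn)(const char *, size_t);

/* Plain byte loop, used here only as something to measure. */
int all_zero_loop(const char *buf, size_t size)
{
    size_t i;
    for (i = 0; i < size; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/* Calls fn once untimed, then returns the fastest of `runs` timed calls
 * in clock() ticks. The last return value of fn is stored in *result. */
clock_t best_time(check_fn fn, const char *buf, size_t size,
                  int runs, int *result)
{
    volatile int sink = fn(buf, size);   /* warm-up call, not timed */
    clock_t best = 0;
    int r;

    for (r = 0; r < runs; r++) {
        clock_t t0 = clock();
        sink = fn(buf, size);
        clock_t dt = clock() - t0;
        if (r == 0 || dt < best)
            best = dt;
    }
    *result = sink;
    return best;
}
```

This still measures everything inside one process, so it only reduces the ordering effect; running each candidate in its own executable remains the safer comparison.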
(Apologies for posting this as an answer rather than a comment; I didn't have enough reputation.)
Edit: Bad answer
A novel approach might be
int is_empty(char * buf, int size) {
char start = buf[0];
char end = buf[size-1];
buf[0] = 'x';
buf[size-1] = '\0';
int result = strlen(buf) == 0;
buf[0] = start;
buf[size-1] = end;
return result;
}
Why the craziness? Because strlen is one of the library functions most likely to be optimized. The first character is saved and overwritten to prevent a false positive; the last character is saved and overwritten to make sure strlen terminates.
Look at fast memcpy - it can be adapted for memcmp (or memcmp against a constant value).
With x86 you can use SSE to test 16 bytes at a time:
#include <stddef.h>    // size_t
#include <smmintrin.h> // note: requires SSE 4.1
int is_empty(const char *buf, const size_t size)
{
size_t i;
for (i = 0; i + 16 <= size; i += 16)
{
__m128i v = _mm_loadu_si128((const __m128i *)&buf[i]);
if (!_mm_testz_si128(v, v))
return 0;
}
for ( ; i < size; ++i)
{
if (buf[i] != 0)
return 0;
}
return 1;
}
This can probably be further improved with loop unrolling.
On modern x86 CPUs with AVX you can even use 256 bit SIMD and test 32 bytes at a time.
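A rough sketch of what the 32-byte variant could look like, assuming GCC or Clang on x86. The function names (`is_empty_avx`, `is_empty_scalar`, `is_empty_256`) are invented for this example. The `target("avx")` attribute and `__builtin_cpu_supports` are compiler extensions, used here so that the code builds without `-mavx` and falls back to a byte loop on CPUs without AVX.

```c
#include <stddef.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_AVX_PATH 1
#include <immintrin.h>

/* Built with AVX enabled for this one function only (GCC/Clang extension). */
__attribute__((target("avx")))
static int is_empty_avx(const char *buf, size_t size)
{
    size_t i;
    for (i = 0; i + 32 <= size; i += 32) {
        __m256i v = _mm256_loadu_si256((const __m256i *)&buf[i]);
        if (!_mm256_testz_si256(v, v))   /* testz returns 1 only if v is all zero */
            return 0;
    }
    for (; i < size; ++i)                /* remaining 0..31 bytes */
        if (buf[i] != 0)
            return 0;
    return 1;
}
#endif

/* Portable fallback: one byte at a time. */
static int is_empty_scalar(const char *buf, size_t size)
{
    size_t i;
    for (i = 0; i < size; ++i)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/* Picks the AVX path at run time when the CPU reports support for it. */
int is_empty_256(const char *buf, size_t size)
{
#ifdef HAVE_AVX_PATH
    if (__builtin_cpu_supports("avx"))
        return is_empty_avx(buf, size);
#endif
    return is_empty_scalar(buf, size);
}
```

Whether this beats the 16-byte version depends on the CPU and on memory bandwidth, so measure before relying on it.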
Did anyone mention unrolling the loop? In any of these loops, the loop overhead and indexing are going to be significant.
Also, what is the probability that the buffer will actually be empty? That's the only case where you have to check all of it. If there typically is some garbage in the buffer, the loop should stop very early, so it doesn't matter.
If you plan to clear it to zero when it's not zero, it would probably be faster just to clear it with memset(buf, 0, size), whether or not it's already zero. (Use sizeof(buf) only if buf is an actual array rather than a pointer.)
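To make the unrolling idea concrete, here is one possible sketch in portable C: it reads four 8-byte words per iteration and ORs them together, so there is a single branch per 32 bytes. The name `is_empty_words` and the factor of four are arbitrary choices for illustration, not something taken from the thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

int is_empty_words(const char *buf, size_t size)
{
    size_t i = 0;

    /* Unrolled part: 4 x 8 bytes per iteration, one test per 32 bytes. */
    for (; i + 32 <= size; i += 32) {
        uint64_t w[4];
        memcpy(w, buf + i, sizeof w);   /* memcpy sidesteps alignment and aliasing issues */
        if ((w[0] | w[1] | w[2] | w[3]) != 0)
            return 0;
    }

    /* Tail: remaining 0..31 bytes, one at a time. */
    for (; i < size; ++i)
        if (buf[i] != 0)
            return 0;

    return 1;
}
```

An optimizing compiler will often turn the fixed-size memcpy into plain loads, but that is worth confirming in the generated assembly for your target.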
int is_empty(char * buf, int size)
{
return buf[0] == '\0';
}
If your buffer is not a character string, I think the loop in the question is the fastest way to check...
memcmp() would require you to create a buffer the same size and then use memset to set it all as 0. I doubt that would be faster...
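For what it's worth, memcmp can be used without a second buffer by comparing the buffer against itself shifted by one byte: if the first byte is zero and every byte equals its neighbour, the whole buffer is zero. A minimal sketch (the name `is_empty_memcmp` is made up here), with no claim that it is faster than the alternatives:

```c
#include <stddef.h>
#include <string.h>

int is_empty_memcmp(const char *buf, size_t size)
{
    if (size == 0)
        return 1;
    /* buf[0] == 0 and buf[i] == buf[i + 1] for all i implies every byte is 0.
     * memcmp only reads, so the overlapping ranges are fine. */
    return buf[0] == 0 && memcmp(buf, buf + 1, size - 1) == 0;
}
```

How this performs depends entirely on the library's memcmp implementation, so benchmark it against the plain loop before adopting it.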