I was confronted with a tricky (IMO) question. I needed to compare two MAC addresses, in the most efficient manner.
The only thought that crossed my mind in that moment
If he is really dissatisfied with this approach (which is essentially a brain fart, since you aren't comparing megabytes or gigabytes of data, so one shan't really be worrying about "efficiency" and "speed" in this case), just tell him that you trust the quality and speed of the standard library:
int isEqual(MAC* addr1, MAC* addr2)
{
return memcmp(&addr1->data, &addr2->data, sizeof(addr1->data)) == 0;
}
You have a MAC structure (which contains an array of 6 bytes),
typedef struct {
char data[6];
} MAC;
Which agrees with this article about typedef for fixed length byte array.
The naive approach would be to assume the MAC address is word aligned (which is probably what the interviewer wanted), albeit not guaranteed.
typedef unsigned long u32;
typedef signed long s32;
typedef unsigned short u16;
typedef signed short s16;
int
MACcmp(MAC* mac1, MAC* mac2)
{
if(!mac1 || !mac2) return(-1); //check for NULL args
u32 m1 = *(u32*)mac1->data;
U32 m2 = *(u32*)mac2->data;
if( m1 != m2 ) return (s32)m1 - (s32)m2;
u16 m3 = *(u16*)(mac1->data+4);
u16 m2 = *(u16*)(mac2->data+4);
return (s16)m3 - (s16)m4;
}
Slightly safer would be to interpret the char[6] as a short[3] (MAC more likely to be aligned on even byte boundaries than odd),
typedef unsigned short u16;
typedef signed short s16;
int
MACcmp(MAC* mac1, MAC* mac2)
{
if(!mac1 || !mac2) return(-1); //check for NULL args
u16* p1 = (u16*)mac1->data;
u16* p2 = (u16*)mac2->data;
for( n=0; n<3; ++n ) {
if( *p1 != *p2 ) return (s16)*p1 - (s16)*p2;
}
return(0);
}
Assume nothing, and copy to word aligned storage, but the only reason for typecasting here is to satisfy the interviewer,
typedef unsigned short u16;
typedef signed short s16;
int
MACcmp(MAC* mac1, MAC* mac2)
{
if(!mac1 || !mac2) return(-1); //check for NULL args
u16 m1[3]; u16 p2[3];
memcpy(m1,mac1->data,6);
memcpy(m2,mac2->data,6);
for( n=0; n<3; ++n ) {
if( m1[n] != m2[n] ) return (s16)m1[n] - (s16)m2[n];
}
return(0);
}
Save yourself lots of work,
int
MACcmp(MAC* mac1, MAC* mac2)
{
if(!mac1 || !mac2) return(-1);
return memcmp(mac1->data,mac2->data,6);
}
There is nothing wrong with an efficient implementation, for all you know this has been determined to be hot code that is called many many times. And in any case, its okay for interview questions to have odd constraints.
Logical AND is a priori a branching instruction due to short-circuit evaluation even if it doesn't compile this way, so lets avoid it, we don't need it. Nor do we need to convert our return value to a true bool (true or false, not 0 or anything that's not zero).
Here is a fast solution on 32-bit: XOR will capture the differences, OR will record difference in both parts, and NOT will negate the condition into EQUALS, not UNEQUAL. The LHS and RHS are independent computations, so a superscalar processor can do this in parallel.
int isEqual(MAC* addr1, MAC* addr2)
{
return ~((*(int*)addr2 ^ *(int*)addr1) | (int)(((short*)addr2)[2] ^ ((short*)addr1)[2]));
}
EDIT
The purpose of the above code was to show that this could be done efficiently without branching. Comments have pointed out this C++ classifies this as undefined behavior. While true, VS handles this fine. Without changing the interviewer's struct definition and function signature, in order to avoid undefined behavior an extra copy must be made. So the non-undefined behavior way without branching but with an extra copy would be as follows:
int isEqual(MAC* addr1, MAC* addr2)
{
struct IntShort
{
int i;
short s;
};
union MACU
{
MAC addr;
IntShort is;
};
MACU u1;
MACU u2;
u1.addr = *addr1; // extra copy
u2.addr = *addr2; // extra copy
return ~((u1.is.i ^ u2.is.i) | (int)(u1.is.s ^ u2.is.s)); // still no branching
}
This would work on most systems,and be faster than your solution.
int isEqual(MAC* addr1, MAC* addr2)
{
return ((int32*)addr1)[0] == ((int32*)addr2)[0] && ((int16*)addr1)[2] == ((int16*)addr2)[2];
}
would inline nicely too, could be handy at the center of loop on a system where you can check the details are viable.