Can I trust that the C compiler applies modulo 2^n each time I access a bit field? Or is there any compiler/optimisation where code like the one below would not print out Overflo
Yes. We can get the answer from the assembly. Here is an example I wrote on Ubuntu 16.04, 64-bit, with gcc.
#include <stdio.h>
typedef unsigned int uint32_t;
struct {
uint32_t foo1:8;
uint32_t foo2:24;
} G;
int main() {
G.foo1 = 0x12;
G.foo2 = 0xffffff; // G is 0xffffff12
printf("G.foo1=0x%02x, G.foo2=0x%06x, G=0x%08x\n", G.foo1, G.foo2, *(uint32_t *)&G);
G.foo2++; // G.foo2 overflow
printf("G.foo1=0x%02x, G.foo2=0x%06x, G=0x%08x\n", G.foo1, G.foo2, *(uint32_t *)&G);
G.foo1 += (0xff-0x12+1); // G.foo1 overflow
printf("G.foo1=0x%02x, G.foo2=0x%06x, G=0x%08x\n", G.foo1, G.foo2, *(uint32_t *)&G);
return 0;
}
Compile it with gcc -S <.c file> to get the assembly file (.s). Here is the assembly for G.foo2++; with my comments added.
movl G(%rip), %eax
shrl $8, %eax # 0xffffff12-->0x00ffffff
addl $1, %eax # 0x00ffffff+1=0x01000000
andl $16777215, %eax # 16777215=0xffffff, so the mask clears bit 24: eax=0x00000000
sall $8, %eax # 0x00000000-->0x00000000 (move foo2 back to bits 8..31)
movl %eax, %edx # edx high 24 bits are foo2
movl G(%rip), %eax # reload G to pick up foo1
movzbl %al, %eax # so eax=0x00000012
orl %edx, %eax # eax=0x00000012 | 0x00000000 = 0x00000012
movl %eax, G(%rip) # write to G
We can see that the compiler uses shift and mask instructions to guarantee the wrap-around you describe. Note that the layout of G here is:
----------------------------------
| foo2-24bit | foo1-8bit |
----------------------------------
The output of the program above is:
G.foo1=0x12, G.foo2=0xffffff, G=0xffffff12
G.foo1=0x12, G.foo2=0x000000, G=0x00000012
G.foo1=0x00, G.foo2=0x000000, G=0x00000000
Short answer: yes, you can trust modulo 2^n to happen.
In your program, G.foo++; is in fact equivalent to G.foo = (unsigned int)G.foo + 1.
Unsigned int arithmetic is always performed modulo 2^(size of unsigned int in bits). The two least significant bits of the result are then stored in G.foo, producing zero.
Yes, you can trust the C compiler to do the right thing here, as long as the bit field is declared with an unsigned type, which you have with uint8_t. From the C99 standard §6.2.6.1/3:
Values stored in unsigned bit-fields and objects of type unsigned char shall be represented using a pure binary notation.
From §6.7.2.1/9:
A bit-field is interpreted as a signed or unsigned integer type consisting of the specified number of bits. If the value 0 or 1 is stored into a nonzero-width bit-field of type _Bool, the value of the bit-field shall compare equal to the value stored.
And from §6.2.5/9 (emphasis mine):
The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the representation of the same value in each type is the same. A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
So yes, you can be sure that any standards-conforming compiler will have G.foo overflow to 0 without any other unwanted side effects.
No. The compiler allocates 2 bits to the field, and incrementing 3 results in 100b, which when placed in two bits results in 0.