Is it worthwhile using C's bit-field implementation? If so, when is it ever used?
I was looking through some emulator code and it looks like the registers for the c
Bit fields were used in the olden days to save program memory.
They degrade performance because registers cannot work with them directly, so they have to be converted to integers before you can do anything with them. They tend to lead to more complex code that is unportable and harder to understand (since things have to be masked and unmasked all the time to actually use the values).
Check out the source for http://www.nethack.org/ to see pre-ANSI C in all its bitfield glory!
The primary purpose of bit-fields is to provide a way to save memory in massively instantiated aggregate data structures by achieving tighter packing of data.
The whole idea is to take advantage of situations where you have several fields in some struct type that don't need the entire width (and range) of some standard data type. This gives you the opportunity to pack several such fields into one allocation unit, thus reducing the overall size of the struct type. An extreme example would be boolean fields, which can be represented by individual bits (with, say, 32 of them being packable into a single unsigned int allocation unit).
Obviously, this only makes sense in situations where the pros of reduced memory consumption outweigh the cons of slower access to values stored in bit-fields. However, such situations arise quite often, which makes bit-fields an absolutely indispensable language feature. This should answer your question about the modern use of bit-fields: not only are they used, they are essentially mandatory in any practically meaningful code oriented toward processing large amounts of homogeneous data (large graphs, for example), because their memory-saving benefits greatly outweigh any individual-access performance penalties.
In a way, bit-fields are very similar in purpose to the "small" arithmetic types: signed/unsigned char, short, float. In actual data-processing code one would not normally use any types smaller than int or double (with few exceptions). Arithmetic types like signed/unsigned char, short and float exist just to serve as "storage" types: memory-saving compact members of struct types in situations where their range (or precision) is known to be sufficient. Bit-fields are just another step in the same direction, trading a bit more performance for much greater memory savings.
So, that gives us a rather clear set of conditions under which it is worthwhile to employ bit-fields:
If the conditions are met, you declare all bit-packable fields contiguously (typically at the end of the struct type), assign them their appropriate bit-widths (and, usually, take some steps to ensure that the bit-widths are appropriate). In most cases it makes sense to play around with ordering of these fields to achieve the best packing and/or performance.
There's also a weird secondary use of bit-fields: using them for mapping bit groups in various externally-specified representations, like hardware registers, floating-point formats, file formats, etc. This has never been intended as a proper use of bit-fields, even though for some unexplained reason this kind of bit-field abuse continues to pop up in real-life code. Just don't do this.
Boost.Thread uses bitfields in its shared_mutex, on Windows at least:
struct state_data
{
    unsigned shared_count:11,
        shared_waiting:11,
        exclusive:1,
        upgrade:1,
        exclusive_waiting:7,
        exclusive_waiting_blocked:1;
};
In modern code, there's really only one reason to use bitfields: to control the space requirements of a bool or an enum type, within a struct/class. For instance (C++):
enum token_code { TK_a, TK_b, TK_c, ... /* less than 255 codes */ };
struct token {
    token_code code      : 8;
    bool number_unsigned : 1;
    bool is_keyword      : 1;
    /* etc */
};
IMO there's basically no reason not to use :1 bitfields for bool, as modern compilers will generate very efficient code for it. In C, though, make sure your bool typedef is either the C99 _Bool or, failing that, an unsigned int, because a signed 1-bit field can hold only the values 0 and -1 (unless you somehow have a non-twos-complement machine).
With enumeration types, always use a size that corresponds to the size of one of the primitive integer types (8/16/32/64 bits, on normal CPUs) to avoid inefficient code generation (repeated read-modify-write cycles, usually).
Using bitfields to line up a structure with some externally-defined data format (packet headers, memory-mapped I/O registers) is commonly suggested, but I actually consider it a bad practice, because C doesn't give you enough control over endianness, padding, and (for I/O regs) exactly what assembly sequences get emitted. Have a look at Ada's representation clauses sometime if you want to see how much C is missing in this area.
Bit-fields are typically only used when there's a need to map structure fields to specific bit slices, where some hardware will be interpreting the raw bits. An example might be assembling an IP packet header. I can't see a compelling reason for an emulator to model a register using bit-fields, as it's never going to touch real hardware!
Whilst bit-fields can lead to neat syntax, they're pretty platform-dependent, and therefore non-portable. A more portable, though more verbose, approach is to use direct bitwise manipulation, using shifts and bit-masks.
If you use bit-fields for anything other than assembling (or disassembling) structures at some physical interface, performance may suffer. This is because every time you read or write from a bit-field, the compiler will have to generate code to do the masking and shifting, which will burn cycles.
One use for bitfields which hasn't yet been mentioned is that unsigned bitfields provide arithmetic modulo a power-of-two "for free". For example, given:
    struct { unsigned x:10; } foo;
arithmetic on foo.x will be performed modulo 2^10 = 1024.
(The same can be achieved directly by using bitwise & operations, of course - but sometimes it might lead to clearer code to have the compiler do it for you).