I've been studying C# and ran across some familiar ground from my old work in C++. I never understood the reason for bitwise operators in a real application. I've never used them.
Another typical (but I think less common) usage is to compose several numbers into one big number. An example of this is the Windows RGB macro:
#define RGB(r, g, b) ((DWORD) (((BYTE) (r) | ((WORD) (g) << 8)) | (((DWORD) (BYTE) (b)) << 16)))
Here you take 3 bytes and compose an integer from them that represents the RGB value.
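Going the other way is just as common: to pull the individual channels back out, you shift and mask (Windows provides the GetRValue, GetGValue and GetBValue macros for exactly this). A minimal sketch of both directions:

#include <cstdint>

// Compose: each channel occupies its own 8-bit field.
uint32_t rgb = (uint32_t)r | ((uint32_t)g << 8) | ((uint32_t)b << 16);

// Decompose: shift the field down, then mask off everything else.
uint8_t red   = rgb & 0xFF;
uint8_t green = (rgb >> 8) & 0xFF;
uint8_t blue  = (rgb >> 16) & 0xFF;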
As to 'how they work': bitwise operations are one of the lowest level operations CPUs support, and in fact some bitwise operations, like NAND and NOR, are universal - you can build any operation at all out of a sufficiently large set of NAND gates. This may seem academic, but if you look at how things like adders are implemented in hardware, that's often basically what they boil down to.
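For instance, NOT, AND and OR can each be built from NAND alone. A small sketch of the standard construction:

bool nand_(bool a, bool b) { return !(a && b); }

bool not_(bool a)         { return nand_(a, a); }             // NOT a   = a NAND a
bool and_(bool a, bool b) { return not_(nand_(a, b)); }       // a AND b = NOT(a NAND b)
bool or_(bool a, bool b)  { return nand_(not_(a), not_(b)); } // a OR b  = (NOT a) NAND (NOT b)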
As to the 'point': in a lot of higher level applications, of course, there is not much use for bit ops, but at the lowest levels of a system they are incredibly important. Just off the top of my head, things that would be very difficult to write without bit operations include device drivers, cryptographic software, error correction systems like RAID5 or erasure codes, checksums like CRC, video decoding software, memory allocators, or compression software.
They are also useful for maintaining large sets of integers efficiently, for instance in the fd_set used by the common select syscall, or when solving certain search/optimization problems.
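Under the hood, fd_set is essentially a packed array of bits indexed by integer; the real FD_SET/FD_ISSET macros work much like this minimal sketch:

#include <cstdint>

uint64_t bits[16] = {0};  // room for integers 0..1023, one bit each

void set_member(int n)   { bits[n >> 6] |= uint64_t(1) << (n & 63); }
void clear_member(int n) { bits[n >> 6] &= ~(uint64_t(1) << (n & 63)); }
bool is_member(int n)    { return (bits[n >> 6] >> (n & 63)) & 1; }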
Take a look at the source of an MPEG4 decoder, cryptography library, or operating system kernel sometime and you'll see many, many examples of bit operations being used.
I work in motion control (among other things), and the way you communicate with the drives is usually by using bit sequences. You set one bit pattern in a memory location x to set the motion profile, you set an enable bit at memory location y to start the motion, read a pattern from location z to get the status of the move, etc. The lower you go, the more bit twiddling you have to perform.
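As an illustration, the pattern typically looks something like this (the addresses and bit positions here are entirely made up; every drive vendor defines its own register map):

#include <cstdint>

// Hypothetical memory-mapped drive registers.
volatile uint16_t* const PROFILE_REG = (volatile uint16_t*)0x4000;
volatile uint16_t* const CONTROL_REG = (volatile uint16_t*)0x4002;
volatile uint16_t* const STATUS_REG  = (volatile uint16_t*)0x4004;

void start_move()
{
    *PROFILE_REG = 0x0132;               // bit pattern selecting the motion profile
    *CONTROL_REG |= 1 << 0;              // set the enable bit to start the motion
    while (!((*STATUS_REG >> 3) & 1)) {} // poll a status bit until the move reports done
}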
A couple of examples:
Communication stacks: a header attached to data in a layer of a communication stack may contain bytes where individual bits within those bytes signify something, and so have to be masked before they can be processed. Similarly, when assembling the header in the response, individual bits will then need to be set or cleared.
Embedded software: embedded microcontrollers can have tens or hundreds of hardware registers, in which individual bits (or collections thereof) control different functions within the chip, or indicate the status of parts of the hardware.
Incidentally, in C and C++, bitfields are not recommended where portability is important, as the order of bits in a bitfield is compiler-dependent. Using masks instead guarantees which bit(s) will be set or cleared, as in the sketch below.
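A minimal sketch of the mask-based approach (the flag names and bit positions are hypothetical):

#include <cstdint>

const uint8_t FLAG_READY = 1 << 0; // hypothetical bit assignments
const uint8_t FLAG_ERROR = 1 << 3;

void set_ready(volatile uint8_t& reg)   { reg |= FLAG_READY; }          // set a bit
void clear_error(volatile uint8_t& reg) { reg &= (uint8_t)~FLAG_ERROR; } // clear a bit
bool has_error(volatile uint8_t& reg)   { return (reg & FLAG_ERROR) != 0; } // test a bit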
Minimizing Memory Use
Naturally a very general reason is to cram a lot of data into a small amount of memory. Consider an array of booleans like this:
bool data[64] = {...};
That can take 64 bytes (512 bits) of memory. Meanwhile the same idea can be represented with bits using 8 bytes (64-bits) of memory:
uint64_t data = ...;
And of course we have a boatload of DRAM these days, so compacting all this data into the minimum size might not seem to matter, but we're still dealing with 64-bit general-purpose registers, 64-byte cache lines, and kilobytes per physically-mapped page, and moving data down the memory hierarchy is expensive. So if you're processing a boatload of data sequentially, and you can reduce it to 1/8th its size, often you'll be able to process a whole lot more of it in a shorter amount of time.
So a common application of this idea of storing a bunch of booleans in a small amount of space is bit flags, like this:
enum Flags
{
flag_selected = 1 << 0,
flag_hidden = 1 << 1,
flag_removed = 1 << 2,
flag_hovering = 1 << 3,
flag_minimized = 1 << 4,
...
};
uint8_t flags = flag_selected | flag_hovering;
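Setting, clearing, toggling and testing individual flags then looks like this:

flags |= flag_hidden;      // set a flag
flags &= ~flag_selected;   // clear a flag
flags ^= flag_minimized;   // toggle a flag
if (flags & flag_hovering) // test a flag
{
    ...
}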
Operating on Multiple Bits at Once
But on top of cramming all this data into a smaller amount of space, you can also do things like test for multiple bits simultaneously:
// Check if the element is hidden or removed.
if (flags & (flag_hidden | flag_removed))
{
...
}
And a smart optimizer will typically reduce that down to a single bitwise AND when flag_hidden and flag_removed are literal constants known at compile-time.
As another example, let's go back to the array from earlier:
bool data[64];
Let's say you want to test whether all 64 booleans are set, and do something different in that case. Given this representation, we might have to do this:
bool all_set = true;
for (int j=0; j < 64; ++j)
{
if (!data[j])
{
all_set = false;
break;
}
}
if (all_set)
{
// Do something different when all booleans are set.
...
}
And that's pretty expensive when the bitwise representation allows us to do this:
uint64_t data = ...;
if (data == 0xffffffffffffffff)
{
// Do something different when all bits are set.
...
}
The version above can check all 64 bits with a single comparison on 64-bit machines. With SIMD registers, you can test even more than 64 bits at a time with a single SIMD instruction.
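For instance, here is a sketch using SSE2 intrinsics to test 128 bits at once:

#include <emmintrin.h>

// Returns true if all 128 bits in the 16 bytes at p are set.
bool all_bits_set_128(const void* p)
{
    __m128i v    = _mm_loadu_si128((const __m128i*)p);
    __m128i ones = _mm_set1_epi8((char)0xFF);
    __m128i eq   = _mm_cmpeq_epi8(v, ones);      // 0xFF per byte that matches
    return _mm_movemask_epi8(eq) == 0xFFFF;      // one bit per byte, all 16 set
}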
As another example, let's say you want to count how many of those booleans are set. Working with the boolean representation, you might have to do this:
int count = 0;
for (int j=0; j < 64; ++j)
count += data[j];
// do something with count
Meanwhile, with bitwise operations, you can do this:
uint64_t data = ...;
const int count = __popcnt64(data);
// do something with count
And some hardware can do that very efficiently as a native instruction. Others can still do it a whole, whole lot faster than looping through 64 booleans and counting the booleans set to true.
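Note that __popcnt64 is an MSVC intrinsic; GCC and Clang spell it __builtin_popcountll, and C++20 standardizes it as std::popcount in <bit>:

#include <bit>
#include <cstdint>

uint64_t data = ...;
const int count = std::popcount(data); // maps to a native POPCNT instruction where available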
Efficient Arithmetic
Another common one is efficient arithmetic. If you have something like:
x = pow(2, n);
Where n is a runtime variable, you can often get a much more efficient result doing:
x = 1 << n;
Of course an optimizing compiler treating pow as an intrinsic might be able to translate the former into the latter, but at least the C and C++ compilers I've checked as of late cannot perform this optimization when n is not known at compile-time.
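One caveat with the shift version: the result has the type of the left operand, so for a full 64-bit range the literal needs to be 64-bit, and n must be smaller than the operand's width or the behavior is undefined:

uint64_t x = uint64_t(1) << n; // defined for n in [0, 63]; plain (1 << n) is limited by int's width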
Whenever you're working with powers of two, you can often do a lot of things efficiently with bit shifts and other bitwise operations. For example, take this:
x = n % power_of_two;
... where power_of_two is a runtime variable that is always a power of two. In that case you can do:
x = n & (power_of_two - 1);
Which has the same effect as the modulo (only for power-of-two numbers). It works because a power-of-two value is always a single set bit followed by zeros. For example, 16 is 0b10000. Subtracting one gives 0b1111, and a bitwise AND with that clears all the upper bits, giving the equivalent of n % 16.
Similar thing with left-shifts to multiply by a power of two, right-shifts to divide by a power of two, etc. One of the main reasons a lot of hardware favored power-of-two image sizes like 16x16, 32x32, 64x64, 256x256, etc. is the efficient arithmetic this enables using bitwise instructions.
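For instance, assuming unsigned x:

unsigned y = x << 3;          // x * 8
unsigned z = x >> 2;          // x / 4
unsigned a = (x + 15) & ~15u; // round x up to the next multiple of 16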
Conclusion
So anyway, this is a brief introduction to what you can do with bitwise operations: fast arithmetic, reduced memory use, and the ability to perform operations on potentially many bits at once without looping through them one bit at a time.
And they're still very relevant today in performance-critical fields. For example, the Atomontage voxel rendering engine claims to be able to represent a voxel in about a single bit, and that's important not just for fitting huge voxel data in DRAM but also for rendering it really quickly from smaller, faster memory like registers. Naturally it couldn't do that if it used 8 bits just to store a true/false kind of value that has to be checked individually.
Three major uses off of the top of my head:
1) In embedded applications, you often have to access memory-mapped registers in which individual bits mean certain things (for instance the update bit in an ADC or serial register). This is more relevant to C++ than C#.
2) Calculation of checksums, like CRCs. These use shifts and masks very heavily. Before anybody says "use a standard library", I have come across non-standard checksums too many times, which have had to be implemented from scratch (a minimal bitwise CRC is sketched after this list).
3) When dealing with data that comes from another platform with a different bit or byte order (or both) from the one your code executes on. This is particularly true when doing software testing of embedded systems, receiving data across a network that has not been converted to network order, or processing bulk data from a data capture system; the byte-swap sketch below shows the usual fix. Check out the Wikipedia article on Endianness. If you are really interested, read the classic article "On Holy Wars and a Plea for Peace" by Danny Cohen.
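To make the CRC point concrete, here is a minimal bit-at-a-time CRC-8 sketch. The polynomial 0x07 is just one common choice; the non-standard checksums mentioned above vary exactly these parameters (polynomial, initial value, reflection):

#include <cstddef>
#include <cstdint>

// Bit-at-a-time CRC-8 (polynomial 0x07, initial value 0x00, no reflection).
uint8_t crc8(const uint8_t* data, size_t len)
{
    uint8_t crc = 0;
    for (size_t i = 0; i < len; ++i)
    {
        crc ^= data[i];                       // fold the next byte into the register
        for (int bit = 0; bit < 8; ++bit)     // process one bit per iteration
            crc = (crc & 0x80) ? (uint8_t)((crc << 1) ^ 0x07)
                               : (uint8_t)(crc << 1);
    }
    return crc;
}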
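And for the byte-order point, shifts and masks let you read multi-byte values portably, regardless of the host's endianness:

#include <cstdint>

// Read a 32-bit big-endian value from a byte buffer, independent of host byte order.
uint32_t read_be32(const uint8_t* p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}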