Bit Manipulation and Flags

Question

https://i.imgur.com/VU56Rwn.png

A: When the man page for open says: The flags specified are formed by or'ing the following values:

       O_RDONLY        open for reading only
       O_WRONLY        open for writing only
       ...

it means we should use a logical or between the flags like this: O_RDONLY || O_WRONLY to specify the combination of permissions we want.

B: To indicate different options we use bit flags (rather than characters or integers) in order to save space.

C: Performing operations on bit flags is fast.

D: Bit flags used in system calls are defined in library files.

E: The command chmod uses bit flag constants defined in octal because there are eight possible permission states.

I know bit flags are not defined in library files if it is a good system library. They are usually constants or #defines in the header, not the compiled object, if that is what "library files" refers to. However, I don't understand how they save space. Aren't bit flags just integers after all?

Answer 1:

First of all, | and || are different operators.

| is the bit-wise OR operator which does an OR on every bit and you get the result of that.

|| is the logical OR operator which returns true if the left side is true or the right side is true, false otherwise.

For bit flags you should use |, as in O_RDONLY | O_WRONLY
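To make the difference concrete, here is a minimal sketch. DEMO_FLAG_A and DEMO_FLAG_B are made-up values for illustration only; the real O_* constants come from <fcntl.h> and their values vary between platforms.

```c
/* Hypothetical flag values, chosen only for this illustration. */
#define DEMO_FLAG_A 0x1u
#define DEMO_FLAG_B 0x2u

/* Bitwise OR keeps every set bit from both operands: 0x1 | 0x2 == 0x3. */
unsigned combine_bitwise(void)
{
    return DEMO_FLAG_A | DEMO_FLAG_B;
}

/* Logical OR only answers "is either operand non-zero?", so the result
   is always 0 or 1 and the individual bits are lost. */
unsigned combine_logical(void)
{
    return DEMO_FLAG_A || DEMO_FLAG_B;
}
```

With these values combine_bitwise() yields 0x3 (both flags present), while combine_logical() yields 1, which a callee would read as DEMO_FLAG_A alone.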

B: To indicate different options we use bit flags (rather than characters or integers) in order to save space.

I think this phrasing is a bit misleading, to be honest. The nice thing about bit flags is that you can pack into a single int, usually 32 bits wide, up to 32 different values that have ON/OFF semantics. So a function that takes these kinds of options only needs a single int to receive multiple options from the caller, because each bit represents one such on/off property. If the bit is 1, the property is ON; otherwise it is OFF.

In contrast, when you don't use bit flags, the function has to take a separate variable for every option, and if you have many options you need a lot of parameters in the function declaration. So you can "save space" if you use bit flags instead.

Consider this:

// lots of other has_property_x cases

int somefunc(int has_property_a, int has_property_b, int has_property_c, ...)
{
    if(has_property_a)
        do_something_based_on_a();
    if(has_property_b)
        do_something_based_on_b();
    /* ... */
}

void foo(void)
{
    somefunc(1, 1, 0, 1, 1);
}

This is not very efficient, it's hard to read, it's a pain in the butt to code, overall not a good design.

However if you use bit flags, you can save a lot of variables in the functions:

// lots of other HAS_PROPERTY_X cases

#define HAS_PROPERTY_A 1
#define HAS_PROPERTY_B 2
#define HAS_PROPERTY_C 4
/* ... further HAS_PROPERTY_X flags (such as HAS_PROPERTY_E), each a distinct power of two ... */

int somefunc(int flags)
{
    if(flags & HAS_PROPERTY_A)
        do_something_based_on_a();
    if(flags & HAS_PROPERTY_B)
        do_something_based_on_b();
    /* ... */
}

void foo(void)
{
    somefunc(HAS_PROPERTY_A | HAS_PROPERTY_B | HAS_PROPERTY_E);
}

This is much more compact and readable.
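Once the options live in one integer, setting, clearing, toggling, and testing them all follow the same few patterns. The sketch below assumes hypothetical OPT_* flags and helper names that are not part of the original answer.

```c
/* Hypothetical option flags, one bit each. */
#define OPT_VERBOSE 0x1u
#define OPT_FORCE   0x2u
#define OPT_DRY_RUN 0x4u

/* Turn the bits in mask on. */
unsigned flags_set(unsigned flags, unsigned mask)    { return flags | mask; }

/* Turn the bits in mask off. */
unsigned flags_clear(unsigned flags, unsigned mask)  { return flags & ~mask; }

/* Flip the bits in mask. */
unsigned flags_toggle(unsigned flags, unsigned mask) { return flags ^ mask; }

/* Non-zero only if every bit in mask is on. */
int flags_has_all(unsigned flags, unsigned mask) { return (flags & mask) == mask; }

/* Non-zero if at least one bit in mask is on. */
int flags_has_any(unsigned flags, unsigned mask) { return (flags & mask) != 0; }
```

For example, flags_set(0u, OPT_VERBOSE | OPT_DRY_RUN) gives 0x5, and flags_has_any(0x5u, OPT_FORCE) is 0 because the FORCE bit is not set.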

Answer 2:

A Brief Overview of What Bitflags are

Bit flags are constants that define a set of some kind, usually options of various kinds. They are typically defined as hexadecimal constants, and the intent is to combine these constants with the bitwise operators in order to build a subset of the total set of constants.

The bitwise operators are | (bitwise Or), & (bitwise And), ^ (bitwise Exclusive Or), and ~ (bitwise Not). Each applies the designated Boolean logic operation bit by bit to its operands (two for |, &, and ^; one for ~) to generate a new value. The bitwise operators are different from the logical operators such as || (logical Or) and && (logical And), which are used with expressions that evaluate to a boolean value of true (non-zero) or false (zero).
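As a small worked example of these operators, the sketch below wraps each one in a function on 8-bit values. The function names are made up for illustration.

```c
#include <stdint.h>

/* Each result is cast back to uint8_t because C promotes the operands
   to int before applying the operator. */
uint8_t bits_and(uint8_t a, uint8_t b) { return (uint8_t)(a & b); }
uint8_t bits_or(uint8_t a, uint8_t b)  { return (uint8_t)(a | b); }
uint8_t bits_xor(uint8_t a, uint8_t b) { return (uint8_t)(a ^ b); }
uint8_t bits_not(uint8_t a)            { return (uint8_t)~a; }

/* Logical And only looks at zero versus non-zero. */
int logical_and(uint8_t a, uint8_t b)  { return a && b; }
```

With a = 0x0C (binary 1100) and b = 0x0A (binary 1010), the results are 0x08 for And, 0x0E for Or, 0x06 for Exclusive Or, and 0xF3 for Not of a. The two kinds of And can disagree: bits_and(0x0C, 0x03) is 0 because the operands share no bits, while logical_and(0x0C, 0x03) is 1 because both are non-zero.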

An example of a typical definition using C Preprocessor define directives to create bitwise flags would be:

#define  ITM_FLAG_EXAMPLE1  0x00000001L   // an example bitwise flag
#define  ITM_FLAG_EXAMPLE2  0x00000002L   // another example flag
#define  ITM_FLAG_JUMP01    0x00040000L   // another example
#define  ITM_FLAG_JUMP02    0x00080000L   // another example

In addition, modern C compilers allow the use of enum types as well.

typedef enum { ITEM_FLAG_1 = 1, ITEM_FLAG_2 = 2, ITEM_FLAG_3 = 4 } ItemType;

ItemType AnItem = ITEM_FLAG_1;    // defining a variable of the type
ItemType AnItem2 = ITEM_FLAG_1 | ITEM_FLAG_2;  // defining a second variable

Or, with an enum tag instead of a typedef:

enum ItemType { ITEM_FLAG_1 = 1, ITEM_FLAG_2 = 2, ITEM_FLAG_3 = 4 };

enum ItemType AnItem = ITEM_FLAG_1;    // defining a variable of the type
enum ItemType AnItem2 = ITEM_FLAG_1 | ITEM_FLAG_2;  // defining a second variable

And for a few examples of how these can be used:

unsigned long ulExample = ITM_FLAG_EXAMPLE2;   // ulExample contains 0x00000002L
unsigned long ulExamplex = ITM_FLAG_EXAMPLE1 | ITM_FLAG_EXAMPLE2;  // ulExamplex contains 0x00000003L
unsigned long ulExampley = ulExamplex & ITM_FLAG_EXAMPLE2;  // ulExampley contains 0x00000002L

See this blog posting, Intro to Truth Tables & Boolean Algebra, which describes the various Boolean Algebra operations and this Wikipedia topic on Truth Tables.

Some Considerations Using Bitflags

There can be a few gotchas when using bitflags and a few things to look out for.

use of a bitflag defined constant that is zero, no bits set, can cause problems
bitwise operations mixed with logical operations can be an area for defects

In general, using a bitflag whose definition is a value of zero rather than non-zero can lead to errors. Most programmers expect that a bitflag will have a non-zero value that represents membership in a set of some kind. A bitflag whose definition is zero is a kind of breach of expectations and, when used in bitwise operations, can lead to unexpected consequences and behavior.
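A sketch of the zero-flag problem, using hypothetical MODE_* constants. The situation is not purely theoretical: on many systems (Linux among them) O_RDONLY is defined as 0, which is why POSIX provides the O_ACCMODE mask for extracting the access mode instead of testing O_RDONLY as a bit.

```c
/* Hypothetical constants for illustration; MODE_READ is deliberately zero. */
#define MODE_READ        0x0u
#define MODE_WRITE       0x1u
#define MODE_APPEND      0x2u
#define MODE_ACCESS_MASK 0x1u   /* the bit(s) that encode read versus write */

/* The usual "is this flag set?" test. With a zero mask it can never
   succeed, whatever flags contains. */
int has_flag_naive(unsigned flags, unsigned mask)
{
    return (flags & mask) != 0;
}

/* A zero-valued option has to be detected by masking out its field
   and comparing the field against the zero value. */
int is_read_mode(unsigned flags)
{
    return (flags & MODE_ACCESS_MASK) == MODE_READ;
}
```

Here has_flag_naive(MODE_READ, MODE_READ) is 0 even though the caller asked for exactly that mode, and MODE_READ | MODE_WRITE is indistinguishable from MODE_WRITE alone.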

When combining bitwise operations on bitflag variables with logical or comparison operators, using parentheses to enforce the intended precedence is generally more understandable, since it does not require the reader to know the C operator precedence table. It also avoids real bugs: == and != bind more tightly than &, ^, and |.
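The precedence trap can be sketched as follows. DEMO_MASK and both function names are hypothetical; many compilers will warn about the first function when warnings are enabled.

```c
#define DEMO_MASK 0x6u   /* hypothetical mask covering two flag bits */

/* Looks like "are both mask bits set?", but == binds tighter than &,
   so it is parsed as flags & (DEMO_MASK == DEMO_MASK), i.e. flags & 1. */
int both_set_buggy(unsigned flags)
{
    return flags & DEMO_MASK == DEMO_MASK;
}

/* Parentheses make the bitwise And happen first, as intended. */
int both_set_fixed(unsigned flags)
{
    return (flags & DEMO_MASK) == DEMO_MASK;
}
```

For flags = 0x6 the buggy version returns 0 although both bits are set, and for flags = 0x1 it returns 1 although neither is. The parenthesized version returns 1 and 0 respectively.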

On to the Posted Questions

When a library is provided, there are typically one or more include files that accompany the library file in order to provide a number of needed items:

function prototypes for the functions provided by the library
variable type declarations and definitions for the types used by the library
special constants for operands and flags that govern the behavior of the library functions

Bit flags are a time-honored way to provide options for a function interface. They have a few nice properties that make them appealing for a C programmer:

compact representation that is easy to transfer over an interface
a natural fit for Boolean Algebra operations and set manipulation
a natural fit with the C bitwise operators to perform those operations

Bit flags save space because the name of the flag may be long and descriptive, but it compiles down to a single unsigned value such as an unsigned short, unsigned long, or unsigned char.

Another nice property of using bit flags is that when using bitwise operators on constants in an expression, most modern compilers will evaluate the bitwise operation as a part of compiling the expression. So a modern compiler will take multiple bitwise operators in an expression such as O_RDONLY | O_WRONLY and do the bitwise Or when compiling the source and replace the expression with the value of the evaluated expression.
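One way to see that such expressions are resolved at compile time is to use them where C requires an integer constant expression, such as an enumerator value, a static assertion (C11), or a case label. The PERM_* names below are hypothetical. Whether an OR of constants in ordinary code is also folded is up to the optimizer, though mainstream compilers generally do it.

```c
/* Hypothetical permission flags for illustration. */
#define PERM_READ  0x1
#define PERM_WRITE 0x2
#define PERM_EXEC  0x4

/* An enumerator value must be a compile-time constant, so the OR
   of the three flags is evaluated by the compiler. */
enum { PERM_ALL = PERM_READ | PERM_WRITE | PERM_EXEC };

/* C11 static assertion, checked during compilation. */
_Static_assert(PERM_ALL == 0x7, "PERM_ALL should have all three bits set");

/* Case labels must also be constants, so PERM_READ | PERM_WRITE
   is reduced to the value 3 before the program ever runs.
   Returns 2 for read+write, 1 for exactly one of them, 0 for neither. */
int access_kind(int perms)
{
    switch (perms & (PERM_READ | PERM_WRITE)) {
    case PERM_READ | PERM_WRITE:
        return 2;
    case PERM_READ:
    case PERM_WRITE:
        return 1;
    default:
        return 0;
    }
}
```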

In most computer architectures the bitwise operators are performed using registers into which the data is loaded and then the bitwise operation is performed. For a 32 bit architecture, using a 32 bit variable to contain a set of bits fits naturally into CPU registers just as in a 64 bit architecture, using either a 32 or 64 bit variable to contain a set of bits fits naturally into registers. This natural fit allows multiple bitwise operations on the same variables without having to do a fetch from the CPU cache or main memory.

The C bitwise operators almost always have a direct CPU machine instruction analogue, so the machine code the compiler generates for them is quite efficient.

The compact representation of bit flags can be easily seen by using an unsigned long to pass 32 different flags or an unsigned long long to pass 64 different flags to a function. An array of unsigned char can be used to pass many more flags by using an array offset and bit flag method along with a set of C Preprocessor macros or a set of functions to manipulate the array.
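The array offset plus bit mask method could look roughly like the sketch below. The FlagSet type, the 200-flag capacity, and all macro and function names are assumptions made for this example, not something the answer prescribes.

```c
#include <limits.h>  /* CHAR_BIT */
#include <string.h>  /* memset */

#define FLAGSET_NBITS 200u  /* hypothetical capacity */
#define FLAGSET_BYTES ((FLAGSET_NBITS + CHAR_BIT - 1) / CHAR_BIT)

/* Flag number n lives in byte n / CHAR_BIT, at bit n % CHAR_BIT. */
#define FLAG_BYTE(n) ((n) / CHAR_BIT)
#define FLAG_MASK(n) ((unsigned char)(1u << ((n) % CHAR_BIT)))

typedef struct {
    unsigned char bytes[FLAGSET_BYTES];
} FlagSet;

/* Callers must keep n below FLAGSET_NBITS; no bounds check is done. */
void flagset_zero(FlagSet *s)              { memset(s->bytes, 0, sizeof s->bytes); }
void flagset_set(FlagSet *s, unsigned n)   { s->bytes[FLAG_BYTE(n)] |= FLAG_MASK(n); }
void flagset_clear(FlagSet *s, unsigned n) { s->bytes[FLAG_BYTE(n)] &= (unsigned char)~FLAG_MASK(n); }
int  flagset_test(const FlagSet *s, unsigned n)
{
    return (s->bytes[FLAG_BYTE(n)] & FLAG_MASK(n)) != 0;
}

/* Sets flag n in an empty set, then clears it again; returns 1 if the
   flag was seen as set and afterwards as clear. */
int flagset_roundtrip(unsigned n)
{
    FlagSet s;
    flagset_zero(&s);
    flagset_set(&s, n);
    int was_set = flagset_test(&s, n);
    flagset_clear(&s, n);
    int is_clear = !flagset_test(&s, n);
    return was_set && is_clear;
}
```

Passing a pointer to one FlagSet hands a function up to 200 independent on/off options in 25 bytes on a platform with 8-bit chars.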

Some Examples of What is Possible

The bitwise operators are very similar to the logical operators used with sets and representing sets with bit flags works out well. If you have a set that contains operands, some of which should not be used with some of the other flags then using the bitwise operators and masking of bits makes it easy to see if both of the conflicting flags are specified.

#define FLAG_1  0x00000001L     // a required flag if FLAG_2 is specified
#define FLAG_2  0x00001000L     // must not be specified with FLAG_3
#define FLAG_3  0x00002000L     // must not be specified with FLAG_2

int func (unsigned long ulFlags)
{
    // check if both FLAG_2 and FLAG_3 are specified. if so error
    // we do a bitwise And to isolate specific bits and then compare that
    // result with the bitwise Or of the bits for equality. this approach
    // makes sure that a check for both bits is turned on.
    // note the parentheses around the bitwise And: == binds tighter than &.
    if ((ulFlags & (FLAG_2 | FLAG_3)) == (FLAG_2 | FLAG_3)) return -1;

    // check to see if either FLAG_1 or FLAG_3 is set we can just do a
    // bitwise And against the two flags and if either one or both are set
    // then the result is non-zero.
    if (ulFlags & (FLAG_1 | FLAG_3)) {
        // do stuff if either or both FLAG_1 and/or FLAG_3 are set
    }

    // check that required option FLAG_1 is specified if FLAG_2 is specified.
    // we are using zero is boolean false and non-zero is boolean true in
    // the following. the ! is the logical Not operator so if FLAG_1 is
    // not set in ulFlags then the expression (ulFlags & FLAG_1) evaluates
    // to zero, False, and the Not operator inverts the False to True or
    // if FLAG_1 is set then (ulFlags & FLAG_1) evaluates to non-zero, True,
    // and the Not operator inverts the True to False. Both sides of the
    // logical And, &&, must evaluate True in order to trigger the return.
    if ((ulFlags & FLAG_2) && ! (ulFlags & FLAG_1)) return -2;
    // other stuff
    return 0;
}

For example see Using select() for non-blocking sockets for a brief overview of the standard socket() interface of using bit flags and the select() function for examples of using a set manipulation abstraction that is similar to what can be done with bit flags. The socket functions allow for the setting of various characteristics such as non-blocking through the use of bit flags with the fcntl() function and the select() function has an associated set of function/macros (FD_SET(), FD_ZERO(), etc.) that provides an abstraction for indicating which socket handles are to be monitored in the select(). I do not mean to imply that select() socket sets are bit maps, though in the original UNIX, I believe they were. However the abstract design of select() and its associated utilities provides a kind of sets that could be implemented with bit flags.
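The set abstraction that accompanies select() can be tried without any sockets at all, since the FD_* macros only manipulate the fd_set object. This is a minimal POSIX sketch; the function name is made up, and descriptor numbers 0, 3, and 5 are used purely as set members, with no I/O performed on them.

```c
#include <sys/select.h>  /* fd_set, FD_ZERO, FD_SET, FD_CLR, FD_ISSET */

/* Builds a descriptor set, edits it, and reports whether the membership
   tests behave like set operations. Returns 1 on the expected outcome. */
int fdset_demo(void)
{
    fd_set readfds;

    FD_ZERO(&readfds);      /* start with the empty set */
    FD_SET(0, &readfds);    /* add descriptor 0 */
    FD_SET(5, &readfds);    /* add descriptor 5 */
    FD_CLR(0, &readfds);    /* remove descriptor 0 again */

    return !FD_ISSET(0, &readfds)   /* removed */
        &&  FD_ISSET(5, &readfds)   /* still a member */
        && !FD_ISSET(3, &readfds);  /* never added */
}
```

The four macros correspond to clearing all bits, setting one bit, clearing one bit, and testing one bit, which is the same vocabulary used with plain integer bit flags.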

Evaluating a variable that contains bitwise flags can also be quick, efficient, and readable. For instance, in a function called with some of the defined flags:

#define ITEM_FLG_01  0x0001
#define ITEM_FLG_02  0x0002
#define ITEM_FLG_03  0x0101
#define ITEM_FLG_04  0x0108
#define ITEM_FLG_05  0x0200

#define ITEM_FLG_SPL1 (ITEM_FLG_01 | ITEM_FLG_02)

there may be a switch() statement such as:

switch (bitwiseflags & ITEM_FLG_SPL1) {
    case ITEM_FLG_01 | ITEM_FLG_02:
        // do things if both ITEM_FLG_01 and ITEM_FLG_02 are set
        break;
    case ITEM_FLG_01:
        // do things if only ITEM_FLG_01 is set
        break;
    case ITEM_FLG_02:
        // do things if only ITEM_FLG_02 is set
        break;
    default:
        // none of the flags we are looking for are set so error
        return -1;
}

And you can do some short and simple expressions such as the following using the same defines as above.

// test if bitwiseflags has bit ITEM_FLG_05 set and if so then call function
// doFunc().
(bitwiseflags & ITEM_FLG_05) == ITEM_FLG_05 && doFunc();

Addendum: A Technique for Really Large Sets of Bitflags

See this answer, Creating bitflag variables with large amounts of flags or how to create large bit-width numbers for an approach for large sets, larger than can fit in a 32 bit or 64 bit variable, of bitflags.

Source: https://stackoverflow.com/questions/49353209/bit-manipulation-and-flags

Tags

bitflags