There are many situations (especially in low-level programming) where the binary layout of the data is important: for example, hardware/driver manipulation, network protocols, and so on.
From the C++14 standard (N3797 draft), section 9.6 [class.bit], paragraph 1:
Allocation of bit-fields within a class object is implementation-defined. Alignment of bit-fields is implementation-defined. Bit-fields are packed into some addressable allocation unit. [ Note: Bit-fields straddle allocation units on some machines and not on others. Bit-fields are assigned right-to-left on some machines, left-to-right on others. — end note ]
Although notes are non-normative, every implementation I'm aware of uses one of two layouts: either big-endian or little-endian bit order.
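A quick way to see which layout your implementation uses is to probe the object representation; a minimal sketch (the Probe type is mine for illustration, and it assumes 8-bit bytes with both fields packed into one byte):
#include <cstdint>
#include <cstdio>
#include <cstring>

struct Probe { uint8_t lo : 4, hi : 4; };

int main() {
    Probe p{};
    p.lo = 0xF;               // set only the first-declared field
    uint8_t raw;
    std::memcpy(&raw, &p, 1); // inspect the raw byte
    // Prints 0x0F if fields are allocated LSB-first, 0xF0 if MSB-first.
    std::printf("0x%02X\n", raw);
    return 0;
}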
Note that code like this relies on fixed-width integer types (from <cstdint>). For examples, look in netinet/tcp.h and other nearby headers.
Edit by OP: for example, tcp.h defines
struct tcphdr
{
    u_int16_t th_sport;    /* source port */
    u_int16_t th_dport;    /* destination port */
    tcp_seq th_seq;        /* sequence number */
    tcp_seq th_ack;        /* acknowledgement number */
# if __BYTE_ORDER == __LITTLE_ENDIAN
    u_int8_t th_x2:4;      /* (unused) */
    u_int8_t th_off:4;     /* data offset */
# endif
# if __BYTE_ORDER == __BIG_ENDIAN
    u_int8_t th_off:4;     /* data offset */
    u_int8_t th_x2:4;      /* (unused) */
# endif
    // ...
};
And since this works with mainstream compilers, it means bit-field memory layout is reliable in practice.
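That said, the same field can always be extracted with an explicit shift, which does not depend on bit-field layout at all. A minimal sketch (the helper name is mine; it relies on the on-the-wire TCP header format, where the 4-bit data offset is the high nibble of byte 12):
#include <cstdint>

// Equivalent of th_off: the data offset is the high nibble of byte 12
// of the raw TCP header, independent of the host's bit-field layout.
inline uint8_t tcp_data_offset(const uint8_t *hdr) {
    return hdr[12] >> 4;
}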
Edit:
This is portable within one endianness:
struct Foo {
    uint16_t x: 10;
    uint16_t y: 6;
};
But this may not be portable, because y straddles a 16-bit allocation unit:
struct Foo {
    uint16_t x: 10;
    uint16_t y: 12;
    uint16_t z: 10;
};
And this may not be portable, because it has implicit padding whose position is implementation-defined:
struct Foo {
    uint16_t x: 10;
};
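To catch such layout surprises early, one can assert the expected object size at compile time; a small sketch (the expected size of 2 bytes is an assumption about a typical implementation):
#include <cstdint>

struct Packed {
    uint16_t x : 10;
    uint16_t y : 6;   // 10 + 6 = 16 bits: exactly one allocation unit
};

// Fails to compile if the implementation straddles or pads differently.
static_assert(sizeof(Packed) == 2, "unexpected bit-field layout");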
It's simple to implement bit fields with known positions in C++:
template<typename T, int POS, int SIZE>
struct BitField {
    T *data;
    BitField(T *data) : data(data) {}
    // Read: shift the field down to bit 0 and mask off the upper bits.
    operator int() const {
        return ((*data) >> POS) & ((1ULL << SIZE) - 1);
    }
    // Write: clear the field's bits, then merge in the new value.
    // Casting x to T before shifting avoids overflow when POS + SIZE
    // exceeds the width of int.
    BitField& operator=(int x) {
        T mask( ((1ULL << SIZE) - 1) << POS );
        *data = (*data & ~mask) | ((T(x) << POS) & mask);
        return *this;
    }
};
The above toy implementation allows one, for example, to define a 12-bit field in an unsigned long long variable with
unsigned long long var;
BitField<unsigned long long, 7, 12> muxno(&var);
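For instance, reading the field back through a hypothetical accessor (the name getMux matches the symbol in the disassembly below):
int getMux() {
    return muxno;  // invokes operator int(): shift right by 7, mask to 12 bits
}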
The generated code to access the field value is just
0000000000000020 <_Z6getMuxv>:
20: 48 8b 05 00 00 00 00 mov 0x0(%rip),%rax ; Get &var
27: 48 8b 00 mov (%rax),%rax ; Get content
2a: 48 c1 e8 07 shr $0x7,%rax ; >> 7
2e: 25 ff 0f 00 00 and $0xfff,%eax ; keep 12 bits
33: c3 retq
Basically what you'd have to write by hand.
We have this in production code, where we had to port MIPS code to x86-64:
https://codereview.stackexchange.com/questions/54342/template-for-endianness-free-code-data-always-packed-as-big-endian
Works well for us.
It's basically a template without any storage; the template arguments specify the position of the relevant bits.
If you need multiple fields, you put multiple specializations of the template together in a union, together with an array of bytes to provide storage.
The template has overloads for assignment and a conversion operator to unsigned
for reading the value.
In addition, if the fields are larger than a byte, they are stored in big-endian byte order, which is sometimes useful when implementing cross-platform protocols.
Here's a usage example:
// BitFieldMember is defined in the code linked above.
#include <cstdlib>   // rand
#include <cstring>   // memset

union header
{
    unsigned char arr[2];     // space allocation, 2 bytes (16 bits)
    BitFieldMember<0, 4> m1;  // first 4 bits
    BitFieldMember<4, 5> m2;  // the following 5 bits
    BitFieldMember<9, 6> m3;  // the following 6 bits, total 16 bits
};

int main()
{
    header a;
    memset(a.arr, 0, sizeof(a.arr));
    a.m1 = rand();
    a.m3 = a.m1;
    a.m2 = ~a.m1;
    return 0;
}
I have written an implementation of bit fields in C++ as a library header file. An example I give in the documentation is that, instead of writing this:
struct A
{
    union
    {
        struct
        {
            unsigned x : 5;
            unsigned a0 : 2;
            unsigned a1 : 2;
            unsigned a2 : 2;
        }
        u;
        struct
        {
            unsigned x : 5;
            unsigned all_a : 6;
        }
        v;
    };
};
// …
A x;
x.v.all_a = 0x3f;
x.u.a1 = 0;
you can write:
typedef Bitfield<Bitfield_traits_default<> > Bf;
struct A : private Bitfield_fmt
{
    F<5> x;
    F<2> a[3];
};
typedef Bitfield_w_fmt<Bf, A> Bwf;
// …
Bwf::Format::Define::T x;
BITF(Bwf, x, a) = 0x3f;
BITF(Bwf, x, a[1]) = 0;
There's an alternative interface, under which the last two lines of the above would change to:
#define BITF_U_X_BWF Bwf
#define BITF_U_X_BASE x
BITF(X, a) = 0x3f;
BITF(X, a[1]) = 0;
Using this implementation of bit fields, the traits template parameter gives the programmer a lot of flexibility. Memory is just processor memory by default, or it can be an abstraction, with the programmer providing functions to perform "memory" reads and writes. The abstracted memory is a sequence of elements of any unsigned integral type (chosen by the programmer). Fields can be laid out either from least-to-most or most-to-least significance. The layout of fields in memory can be the reverse of what they are in the format structure.
The implementation is located at: https://github.com/wkaras/C-plus-plus-library-bit-fields
(As you can see, I unfortunately was not able to fully avoid use of macros.)
C is designed for low-level bit manipulation. It's easy enough to declare a buffer of unsigned chars and set it to any bit pattern you want, especially if your bit strings are very short, so that they fit into one of the integral types.
One potential problem is byte endianness. C can't "see" this at all, but just as integers have an endianness, so too do bytes when serialised. Another is the very small number of machines that don't use octets for bytes. C guarantees a byte shall be at least an octet, but CHAR_BIT values of 9 and 32 exist in real-world implementations. In those circumstances, you have to decide whether to simply ignore the upper bits (in which case naive code should work) or treat them as part of the bit stream (in which case you have to be careful to fold CHAR_BIT into your calculations). It's also hard to test such code, as you are unlikely to get your hands on a CHAR_BIT 32 machine.
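As a sketch of the buffer approach (the helper names are mine; it assumes octet-oriented data, i.e. only the low eight bits of each char belong to the bit stream, MSB-first as most network protocols define it):
#include <cstddef>

// Set or clear bit i of a buffer, MSB-first within each octet.
inline void set_bit(unsigned char *buf, std::size_t i, bool v) {
    unsigned char mask = static_cast<unsigned char>(1u << (7 - i % 8));
    if (v) buf[i / 8] |= mask;
    else   buf[i / 8] &= static_cast<unsigned char>(~mask);
}

// Read bit i of a buffer.
inline bool get_bit(const unsigned char *buf, std::size_t i) {
    return (buf[i / 8] >> (7 - i % 8)) & 1u;
}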