Question
I'm writing a parser for a binary format. The format consists of a number of different tables (somewhere between 50 and 100 of them), each of which is itself binary and usually has fields of varying sizes.
Most of these structures will have bitfields and will look something like these when represented in C:
struct myHeader
{
    unsigned char fieldA : 3;
    unsigned char fieldB : 2;
    unsigned char fieldC : 3;
    unsigned short fieldD : 14;
    unsigned char fieldE : 4;
};
I came across the struct module, but realized that its lowest resolution is a byte, not a bit. Otherwise the module would have been pretty much the right fit for this work.
I know bitfields are supported using ctypes, but I'm not sure how to interface ctypes structs containing bitfields here.
My other option is to manipulate the bits myself, pack them into bytes, and use those with the struct module. But since I have close to 50-100 different types of such structures, writing the code for that becomes error-prone. I'm also worried about efficiency, since this tool might be used to parse many gigabytes of binary data.
Thanks.
Answer 1:
Using bitstring (which you mention you're looking at) it should be easy enough to implement. First to create some data to decode:
>>> import bitstring
>>> myheader = "3, 2, 3, 14, 4"
>>> a = bitstring.pack(myheader, 1, 0, 5, 1000, 2)
>>> a.bin
'00100101000011111010000010'
>>> a.tobytes()
'%\x0f\xa0\x80'
And then decoding it again is just
>>> a.readlist(myheader)
[1, 0, 5, 1000, 2]
Your main concern might well be the speed. The library is well optimised Python, but that's not nearly as fast as a C library would be.
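If speed does turn out to matter, the same MSB-first layout can also be decoded with plain integer shifts and masks from the standard library. This is only a rough sketch, not part of bitstring: the names WIDTHS and unpack_bits are made up for illustration, it assumes the fields are unsigned and packed back to back starting at the first bit, and I haven't benchmarked it against bitstring.

```python
# Field widths in bits, in stream order (same layout as the
# "3, 2, 3, 14, 4" format string above). Hypothetical helper names.
WIDTHS = (3, 2, 3, 14, 4)

def unpack_bits(data, widths=WIDTHS):
    """Decode MSB-first unsigned bit fields from the start of `data`."""
    total = sum(widths)
    nbytes = (total + 7) // 8  # bytes needed to hold all the fields
    if len(data) < nbytes:
        raise ValueError("need %d bytes, got %d" % (nbytes, len(data)))
    # Treat the bytes as one big-endian integer, then drop trailing padding bits.
    value = int.from_bytes(data[:nbytes], "big") >> (nbytes * 8 - total)
    fields = []
    remaining = total
    for width in widths:
        remaining -= width
        # Shift the field down to bit 0 and mask off everything above it.
        fields.append((value >> remaining) & ((1 << width) - 1))
    return fields

print(unpack_bits(b"%\x0f\xa0\x80"))  # [1, 0, 5, 1000, 2]
```

With 50-100 table types, each one would then only need its own tuple of widths rather than hand-written shifting code.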
Answer 2:
I haven't rigorously tested this, but it seems to work with unsigned types (edit: it works with signed byte/short types, too).
Edit 2: This is really hit or miss. It depends on the way the library's compiler packed the bits into the struct, which is not standardized. For example, with gcc 4.5.3 it works as long as I don't use the attribute to pack the struct, i.e. __attribute__ ((__packed__)) (so instead of 6 bytes it gets packed into 4 bytes, which you can check with __alignof__ and sizeof). I can make it almost work by adding _pack_ = True to the ctypes Structure definition, but it fails for fieldE. gcc notes: "Offset of packed bit-field ‘fieldE’ has changed in GCC 4.4".
import ctypes

class MyHeader(ctypes.Structure):
    _fields_ = [
        ('fieldA', ctypes.c_ubyte, 3),
        ('fieldB', ctypes.c_ubyte, 2),
        ('fieldC', ctypes.c_ubyte, 3),
        ('fieldD', ctypes.c_ushort, 14),
        ('fieldE', ctypes.c_ubyte, 4),
    ]

lib = ctypes.cdll.LoadLibrary('C/bitfield.dll')
hdr = MyHeader()
lib.set_header(ctypes.byref(hdr))
for x in hdr._fields_:
    print("%s: %d" % (x[0], getattr(hdr, x[0])))
Output:
fieldA: 3
fieldB: 1
fieldC: 5
fieldD: 12345
fieldE: 9
C:
typedef struct _MyHeader {
    unsigned char fieldA : 3;
    unsigned char fieldB : 2;
    unsigned char fieldC : 3;
    unsigned short fieldD : 14;
    unsigned char fieldE : 4;
} MyHeader, *pMyHeader;

int set_header(pMyHeader hdr) {
    hdr->fieldA = 3;
    hdr->fieldB = 1;
    hdr->fieldC = 5;
    hdr->fieldD = 12345;
    hdr->fieldE = 9;
    return(0);
}
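For the parsing case in the question you may not need the DLL at all: a ctypes Structure can be built straight from a bytes object with from_buffer_copy. Below is a minimal sketch of that idea. parse_header is a hypothetical helper name, and the same caveat applies as above: the bit layout ctypes picks is platform- and compiler-dependent, so it only matches your file format if that layout happens to agree with it. The round trip here just shows the mechanics.

```python
import ctypes

class MyHeader(ctypes.Structure):
    _fields_ = [
        ('fieldA', ctypes.c_ubyte, 3),
        ('fieldB', ctypes.c_ubyte, 2),
        ('fieldC', ctypes.c_ubyte, 3),
        ('fieldD', ctypes.c_ushort, 14),
        ('fieldE', ctypes.c_ubyte, 4),
    ]

def parse_header(data, offset=0):
    """Copy sizeof(MyHeader) bytes out of `data` into a new MyHeader.

    Hypothetical helper; raises ValueError if `data` is too short.
    """
    return MyHeader.from_buffer_copy(data, offset)

# Build a header in memory, dump it to raw bytes, and parse it back.
original = MyHeader(3, 1, 5, 12345, 9)
raw = bytes(original)  # exactly ctypes.sizeof(MyHeader) bytes
parsed = parse_header(raw)

for name, _ctype, _bits in MyHeader._fields_:
    print("%s: %d" % (name, getattr(parsed, name)))
```

It's worth checking ctypes.sizeof(MyHeader) and a few known-good records against the format's spec before trusting this on real data.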
Source: https://stackoverflow.com/questions/7198388/accessing-bitfields-while-reading-writing-binary-data-structures