I want to encode/compress some binary image data as a sequence of bits. (This sequence will, in general, have a length that does not fit neatly in a whole number of standar
Since you have a mapping from symbols to 3-bit strings, bitarray does a nice job of encoding and decoding lists of symbols to and from arrays of bits:
from bitarray import bitarray
from random import choice
symbols = {
'0' : bitarray('000'),
'a' : bitarray('001'),
'b' : bitarray('010'),
'c' : bitarray('011'),
'd' : bitarray('100'),
'e' : bitarray('101'),
'f' : bitarray('110'),
'g' : bitarray('111'),
}
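# pick 40 random symbols; list() is needed on Python 3, where keys() returns a view
seedstring = ''.join(choice(list(symbols.keys())) for _ in range(40))
# construct bitarray using symbol->bitarray mapping
ba = bitarray()
ba.encode(symbols, seedstring)
print(seedstring)
print(ba)
# what does bitarray look like internally?
# tobytes() is the current name for what older bitarray releases called tostring()
ba_string = ba.tobytes()
print(repr(ba_string))  # on Python 3 the repr carries a b'' prefix
print(len(ba_string))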
Prints:
egb0dbebccde0gfdfbc0d0ccfcg0acgg0ccfga00
bitarray('10111101000010001010101001101110010100... etc.
'\xbd\x08\xaanQ\xf4\xc9\x88\x1b\xcf\x82\xff\r\xee@'
15
You can see that this 40-symbol list (40 x 3 = 120 bits) gets encoded into a 15-byte bitarray.
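To make the packing concrete, here is a minimal pure-Python 3 sketch of the same idea. It is not bitarray's implementation: the helpers `pack_symbols` and `unpack_symbols` are hypothetical names made up for this illustration, and the choice to zero-pad the last byte and return the bit count alongside the bytes is an assumption about how you might handle lengths that are not a multiple of 8.

```python
# Illustrative only: hypothetical helpers, not part of bitarray.
CODES = {
    '0': '000', 'a': '001', 'b': '010', 'c': '011',
    'd': '100', 'e': '101', 'f': '110', 'g': '111',
}
DECODE = {bits: sym for sym, bits in CODES.items()}


def pack_symbols(text):
    """Concatenate the 3-bit codes and pack them into bytes, first bit first.

    Returns (data, nbits); nbits is needed later to drop the padding.
    """
    bits = ''.join(CODES[ch] for ch in text)
    nbits = len(bits)
    bits += '0' * (-nbits % 8)  # zero-pad up to a whole number of bytes
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data, nbits


def unpack_symbols(data, nbits):
    """Inverse of pack_symbols: drop the padding and decode 3 bits at a time."""
    bits = ''.join(format(byte, '08b') for byte in data)[:nbits]
    return ''.join(DECODE[bits[i:i + 3]] for i in range(0, nbits, 3))


seed = 'egb0dbebccde0gfdfbc0d0ccfcg0acgg0ccfga00'
packed, nbits = pack_symbols(seed)
print(repr(packed))
print(len(packed), nbits)
assert unpack_symbols(packed, nbits) == seed
```

For the 40-symbol string above this gives the same 15 bytes as the bitarray output. When the bit count is not a multiple of 8, the last byte carries padding, so the bit count (or the symbol count) has to be stored along with the bytes to decode correctly.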