Delimiting binary sequences

甜味超标 2020-12-30 02:21

I need to be able to delimit a stream of binary data. I was thinking of using something like the ASCII EOT (End of Transmission) character to do this.

However I'm a

6 answers
  •  礼貌的吻别
    2020-12-30 02:49

    @sarnold's answer is excellent, and here I want to share some code to illustrate it.

    First, here is a wrong way to do it: using a \n delimiter. Don't do it! The binary data can itself contain \n bytes, and the reader cannot tell those apart from the delimiters:

    import os, random
    with open('test', 'wb') as f:
        for i in range(100):                         # write 100 random binary sequences,
            length = random.randint(2, 100)          # each 2 to 100 bytes long,
            f.write(os.urandom(length) + b'\n')      # each followed by the delimiter b"\n"
    with open('test', 'rb') as f:
        for i, l in enumerate(f):                    # iterating a file splits on every b"\n"
            print(i, l)                              # oops, 123 sequences in this run instead of 100! wrong!
    

    ...
    121 b"L\xb1\xa6\xf3\x05b\xc9\x1f\x17\x94'\n"
    122 b'\xa4\xf6\x9f\xa5\xbc\x91\xbf\x15\xdc}\xca\x90\x8a\xb3\x8c\xe2\x07\x96<\xeft\n'
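
Because the example above uses random data, the number of extra sequences changes on every run. A small deterministic sketch of the same failure, using made-up payloads (`payloads`, `stream` and `recovered` are illustrative names, not part of the original code):

```python
# Three payloads; the second one happens to contain the delimiter byte 0x0A.
payloads = [b"\x01\x02", b"\x03\n\x04", b"\x05"]

# Join them with a trailing b"\n" after each payload, as in the code above.
stream = b"".join(p + b"\n" for p in payloads)

# Split on the delimiter; the last piece is the empty bytes after the final b"\n".
recovered = stream.split(b"\n")[:-1]

print(len(payloads), len(recovered))  # 3 4
print(recovered == payloads)          # False: the second payload was cut in two
```

The same thing happens with any single-byte delimiter, including the EOT byte 0x04 from the question, since arbitrary binary data can contain every byte value.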

    Now the right way to do it (option #4 in sarnold's answer): prefix each chunk with its length, so the reader never has to search the data for a delimiter:

    import os, random
    with open('test', 'wb') as f:
        for i in range(100):
            length = random.randint(2, 100)
            # prefix each chunk with its length, packed into 2 bytes (max 65535)
            f.write(length.to_bytes(2, byteorder='little'))
            f.write(os.urandom(length))
    with open('test', 'rb') as f:
        i = 0
        while True:
            header = f.read(2)        # read the length of the next chunk
            if header == b'':         # end of file
                break
            length = int.from_bytes(header, byteorder='little')
            s = f.read(length)        # read exactly that many bytes
            print(i, s)               # indices 0 to 99: exactly the 100 sequences written
            i += 1
    

    ...
    98 b"\xfa6\x15CU\x99\xc4\x9f\xbe\x9b\xe6\x1e\x13\x88X\x9a\xb2\xe8\xb7(K'\xf9+X\xc4"
    99 b'\xaf\xb4\x98\xe2*HInHp\xd3OxUv\xf7\xa7\x93Qf^\xe1C\x94J)'
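
The reading loop above assumes the file is complete. Below is a slightly more defensive sketch of the same length-prefix idea, using the standard `struct` module and an in-memory `io.BytesIO` buffer. The helpers `write_frame` and `read_frames` are hypothetical names introduced for illustration; they are not from sarnold's answer or from any library.

```python
import io
import os
import struct

# 2-byte unsigned little-endian length, the same layout as to_bytes(2, 'little').
HEADER = struct.Struct("<H")

def write_frame(stream, payload):
    """Write one length-prefixed chunk (hypothetical helper)."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload does not fit in a 2-byte length prefix")
    stream.write(HEADER.pack(len(payload)))
    stream.write(payload)

def read_frames(stream):
    """Yield payloads until a clean end of stream; raise on a truncated frame."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            return  # clean end of stream, no partial frame pending
        if len(header) < HEADER.size:
            raise EOFError("truncated length prefix")
        (length,) = HEADER.unpack(header)
        payload = stream.read(length)
        if len(payload) < length:
            raise EOFError("truncated payload")
        yield payload

# Round trip, including payloads that contain b"\n" and the EOT byte 0x04.
chunks = [b"\n\n\n", b"", os.urandom(50), b"\x04EOT inside\x04"]
buf = io.BytesIO()
for chunk in chunks:
    write_frame(buf, chunk)
buf.seek(0)
decoded = list(read_frames(buf))
print(decoded == chunks)  # True
```

This sketch assumes a file-like object whose `read(n)` only returns fewer than `n` bytes at end of stream, which holds for `io.BytesIO` and ordinary buffered files. On a raw socket or pipe a short read can happen mid-stream, so the reader would need to loop until it has collected `n` bytes.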
