Calculate/validate bz2 (bzip2) CRC32 in Python

我在风中等你 2021-02-15 16:48

I'm trying to calculate/validate the CRC32 checksums for compressed bzip2 archives.

.magic:16                       = 'BZ' signature/magic number
.version:8           


        
2 Answers
  • 2021-02-15 17:31

    To add to the existing answer: there is a final checksum at the end of the stream (the one after eos_magic). It functions as a checksum over all the individual Huffman block checksums. It is initialized to zero and updated each time you finish validating a block checksum. To update it, do as follows:

    crc: u32   # latest validated Huffman block CRC
    ccrc: u32  # current combined checksum (starts at 0)

    ccrc = (ccrc << 1) | (ccrc >> 31)  # rotate left by one bit within 32 bits
    ccrc ^= crc                        # fold in the block CRC
    

    At the end, compare the value of ccrc with the 32-bit unsigned value you read from the compressed file.
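The update rule above relies on 32-bit unsigned arithmetic. Python integers are unbounded, so a Python version needs an explicit mask. The following is a minimal sketch; the function name `combine_block_crcs` is made up for illustration and is not part of any library.

```python
MASK32 = 0xFFFFFFFF

def combine_block_crcs(block_crcs):
    """Fold per-block CRCs into the combined stream checksum.

    block_crcs: iterable of 32-bit unsigned block CRCs, in stream order.
    """
    ccrc = 0  # the combined checksum is initialized to zero
    for crc in block_crcs:
        # Rotate left by one bit, keeping the result within 32 bits.
        ccrc = ((ccrc << 1) | (ccrc >> 31)) & MASK32
        # Fold in the CRC of the block that was just validated.
        ccrc ^= crc
    return ccrc

# For a stream with a single block, the combined checksum equals that block's CRC.
print(hex(combine_block_crcs([0x12345678])))
```

Because the rotation of zero is still zero, a stream with exactly one block ends up with a combined checksum identical to the block CRC.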

  • 2021-02-15 17:49

    The following is the CRC algorithm used by bzip2, written in Python:

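    def bz2_crc32(dataIn):
        # dataIn: bytes of uncompressed data (Python 3)
        # BZ2_crc32Table: the 256-entry table from crctable.c
        crcVar = 0xffffffff # Init
        for byte in dataIn: # iterating over bytes yields ints
            crcVar = crcVar & 0xffffffff # Unsigned
            crcVar = ((crcVar << 8) ^ (BZ2_crc32Table[(crcVar >> 24) ^ byte]))

        # Final inversion, returned as uppercase hex without a "0x" prefix
        return format(~crcVar & 0xffffffff, 'X')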
    

    (C code definitions can be found on lines 155-172 in bzlib_private.h)

    The BZ2_crc32Table array/list can be found in crctable.c in the bzip2 source code. This CRC checksum algorithm is, quoting: "..vaguely derived from code by Rob Warnock, in Section 51 of the comp.compression FAQ..." (crctable.c)

    The checksums are calculated over the uncompressed data.

    Sources can be downloaded here: http://www.bzip.org/1.0.6/bzip2-1.0.6.tar.gz
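For experimenting without copying the table out of crctable.c, here is a self-contained sketch. It assumes the table is the standard MSB-first (non-reflected) CRC-32 table for polynomial 0x04C11DB7, and it assumes the first block header sits byte-aligned right after the 4-byte stream header (6-byte block magic 0x314159265359, then a big-endian 32-bit block CRC). The names `make_table`, `bz2_crc32` and `first_block_crc` are illustrative, not part of the bzip2 sources or the Python standard library.

```python
import bz2

POLY = 0x04C11DB7  # assumed generator polynomial, MSB-first form

def make_table():
    """Build a 256-entry MSB-first CRC-32 lookup table."""
    table = []
    for i in range(256):
        c = i << 24
        for _ in range(8):
            c = ((c << 1) ^ POLY) if c & 0x80000000 else (c << 1)
            c &= 0xFFFFFFFF
        table.append(c)
    return table

CRC_TABLE = make_table()

def bz2_crc32(data: bytes) -> int:
    """CRC over uncompressed data: init 0xFFFFFFFF, final bitwise inversion."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = ((crc << 8) & 0xFFFFFFFF) ^ CRC_TABLE[(crc >> 24) ^ byte]
    return ~crc & 0xFFFFFFFF

def first_block_crc(compressed: bytes) -> int:
    """Read the CRC stored in the first block header (assumed layout)."""
    if compressed[4:10] != bytes.fromhex("314159265359"):
        raise ValueError("no block header found at offset 4")
    return int.from_bytes(compressed[10:14], "big")

data = b"123456789"
computed = bz2_crc32(data)
stored = first_block_crc(bz2.compress(data))
print(hex(computed), computed == stored)
```

The comparison only covers input small enough to fit in a single block. Later block headers are generally not byte-aligned, so reading them requires a bit-level reader.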
