How to save a pandas dataframe in gzipped format directly?

迷失自我 2021-02-19 18:24

I have a pandas data frame, called df.

I want to save this in a gzipped format. One way to do this is the following:

import gzip
import pand

4 answers
  •  予麋鹿 (OP)
    2021-02-19 18:53

    For some reason, the Python zlib module can decompress gzip data, but it cannot compress directly to that format, at least as far as the documentation describes. This is despite the remarkably misleading documentation page header "Compression compatible with gzip".
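
    To illustrate the decompression side mentioned above, here is a minimal sketch. It assumes the standard wbits convention (16 + zlib.MAX_WBITS tells zlib to expect a gzip wrapper); the sample bytes and variable names are made up for illustration.

```python
import gzip
import zlib

# Hypothetical sample payload; any bytes would do.
original = b"col_a,col_b\n1,2\n3,4\n"

# Produce gzip-format data with the gzip module.
gz_bytes = gzip.compress(original)

# wbits = 16 + MAX_WBITS makes zlib expect a gzip header and trailer
# instead of the default zlib wrapper.
restored = zlib.decompress(gz_bytes, 16 + zlib.MAX_WBITS)
```

    The first two bytes of gz_bytes are the gzip magic number 1f 8b, which is what separates this wrapper from the zlib one.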

    You can compress to the zlib format instead using zlib.compress or zlib.compressobj, then strip the zlib header and trailer and add a gzip header and trailer, since the zlib and gzip formats wrap the same compressed data format. This gives you data in the gzip format. The zlib header is fixed at two bytes and the trailer at four bytes, so both are easy to strip. Then prepend a basic ten-byte gzip header, "\x1f\x8b\x08\0\0\0\0\0\0\xff" (C string format), and append the gzip trailer: a four-byte CRC followed by the four-byte length of the uncompressed data modulo 2^32, both in little-endian order. The CRC can be computed using zlib.crc32.
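
    The re-wrapping steps can be sketched roughly as follows. This is an illustration, not a definitive implementation: the function name zlib_to_gzip and the sample payload are hypothetical, and it assumes no preset dictionary is used, so the zlib header is exactly two bytes. Note that the gzip trailer defined in RFC 1952 holds the CRC-32 followed by the uncompressed length modulo 2^32, so the sketch writes both fields.

```python
import gzip
import struct
import zlib

# Basic ten-byte gzip header quoted in the answer: magic bytes, deflate
# method, no flags, zero mtime, no extra flags, unknown OS (0xff).
GZIP_HEADER = b"\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\xff"


def zlib_to_gzip(raw: bytes, level: int = 6) -> bytes:
    """Compress `raw` with zlib, then re-wrap the deflate stream as gzip."""
    z = zlib.compress(raw, level)
    # Drop the 2-byte zlib header and the 4-byte Adler-32 trailer.
    deflate = z[2:-4]
    # gzip trailer: CRC-32 of the uncompressed data, then its length
    # modulo 2**32, both little-endian.
    trailer = struct.pack("<II", zlib.crc32(raw) & 0xFFFFFFFF,
                          len(raw) & 0xFFFFFFFF)
    return GZIP_HEADER + deflate + trailer


# Hypothetical payload: round-trip through the gzip module as a sanity check.
payload = b"a,b\n1,2\n3,4\n"
blob = zlib_to_gzip(payload)
roundtrip = gzip.decompress(blob)
```

    gzip.decompress checks both the CRC and the length field, so a successful round trip suggests the wrapper was assembled correctly.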
