How to check for an empty gzip file in Python

太阳男子 2021-01-12 01:46

I don't want to use OS commands, as that makes it OS-dependent.

This is available in tarfile, tarfile.is_tarfile(filename), to check if

8 answers
  •  囚心锁ツ
    2021-01-12 02:28

    I had a few hundred thousand gzip files mounted on a network share, only a few of which had zero-sized payloads, so I was forced to use the following optimization. It is brittle, but in the (very frequent) case where a large number of files are generated by the same method, the total size of every empty-payload file, not counting the bytes of the file name, is a constant.

    Then you can check for a zero-sized payload by:

    1. Computing that constant over one file. You can code it up, but I find it simpler to just use command-line gzip (and this whole answer is an ugly hack anyway).
    2. Examining only the inode of each remaining file instead of opening it, which can be orders of magnitude faster:
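
Step 1 can also be done in Python rather than with command-line gzip. This is a minimal sketch, assuming you already have one or more files known to hold an empty payload and produced the same way as the rest. `calibrate_overhead` is a hypothetical helper name, not part of any library, and it refuses to return a value when the samples disagree, since that means the constant-overhead assumption does not hold.

```python
import os


def calibrate_overhead(known_empty_paths):
    """Return st_size - len(basename) shared by all sample files.

    Assumption: every path points to a gzip file that is already known
    to contain a zero-byte payload.
    """
    overheads = {
        os.stat(path).st_size - len(os.path.basename(path))
        for path in known_empty_paths
    }
    # Exactly one distinct value means the samples agree on the constant.
    if len(overheads) != 1:
        raise ValueError(f"samples disagree or none given: {sorted(overheads)}")
    return overheads.pop()
```

The returned value can then be passed as `len_minus_file_name`.
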
    from os import stat
    from os.path import basename

    # Step 2: compare the on-disk size minus the file-name length against the constant.
    # YMMV with len_minus_file_name
    def is_gzip_empty(file_name, len_minus_file_name=23):
        return stat(file_name).st_size - len(basename(file_name)) == len_minus_file_name
    

    This could break in many ways. Caveat emptor. Only use it if other methods are not practical.
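
For comparison, when opening each file is affordable, a more robust and OS-independent check is to decompress at most one byte with the standard `gzip` module. This is a sketch of that alternative, not the method described above; `is_gzip_empty_by_reading` is a hypothetical name.

```python
import gzip


def is_gzip_empty_by_reading(file_name):
    """Return True if the decompressed payload has zero bytes."""
    with gzip.open(file_name, "rb") as f:
        # Reading a single byte avoids decompressing the whole payload.
        return f.read(1) == b""
```

This costs one open and a small read per file, which is exactly the overhead the size-based shortcut tries to avoid on a slow network share.
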
