How to unzip gz file using Python

伪装坚强ぢ 2020-12-01 02:37

I need to extract a gz file that I have downloaded from an FTP site to a local Windows file server. I have the variables set for the local path of the file, and I know it ca

6 Answers
  • 2020-12-01 02:53

    Not an exact answer because you're using xml data and there is currently no pd.read_xml() function (as of v0.23.4), but pandas (starting with v0.21.0) can uncompress the file for you! Thanks Wes!

    import pandas as pd
    import os
    fn = '../data/file_to_load.json.gz'
    print(os.path.isfile(fn))
    df = pd.read_json(fn, lines=True, compression='gzip')
    df.tail()
    
  • 2020-12-01 03:02

    Maybe you also want to pass it to pandas.

    import gzip
    import pandas as pd

    # gzip.open returns a file object that pandas can read directly
    with gzip.open('features_train.csv.gz') as f:
        features_train = pd.read_csv(f)

    features_train.head()
    
  • 2020-12-01 03:09
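    # Requires the third-party sh package and a gunzip binary on PATH;
    # sh targets Unix-like systems, so this will not work on Windows.
    from sh import gunzip

    gunzip('/tmp/file1.gz')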
    
  • 2020-12-01 03:10

    If you parse the file after unzipping it, remember to call decode() on each line. This is necessary because a file opened in binary mode yields bytes, not str.

    import gzip
    with gzip.open('file.gz', 'rb') as f:
        for line in f:
            print(line.decode().strip())
    
  • 2020-12-01 03:12

    From the documentation:

    import gzip
    f = gzip.open('file.txt.gz', 'rb')
    file_content = f.read()
    f.close()
    
  • 2020-12-01 03:15
    import gzip
    import shutil
    with gzip.open('file.txt.gz', 'rb') as f_in:
        with open('file.txt', 'wb') as f_out:
            shutil.copyfileobj(f_in, f_out)
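
Since the question is about extracting a downloaded .gz file to a local path, the gzip + shutil approach can be wrapped in a small helper. This is only a sketch: the function name `gunzip_file` and its `dest` parameter are made up for illustration, and it assumes the archive name ends in `.gz` so that the output name can be derived by dropping that suffix.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_file(src, dest=None):
    """Decompress a .gz file to disk and return the output path.

    Hypothetical helper: `src` is the path of the .gz file, `dest` is an
    optional output path. Both names are illustrative assumptions.
    """
    src = Path(src)
    # Default output name: the same path with the trailing ".gz" dropped.
    dest = Path(dest) if dest is not None else src.with_suffix("")
    if dest == src:
        # Happens when src has no suffix; refuse to overwrite the input.
        raise ValueError("cannot derive an output name; pass dest explicitly")
    with gzip.open(src, "rb") as f_in, open(dest, "wb") as f_out:
        # Copy in chunks so large files are not loaded into memory at once.
        shutil.copyfileobj(f_in, f_out)
    return dest
```

On Windows the argument can be a raw string such as `r"C:\some\folder\file.xml.gz"` (a placeholder path, not one from the question); the helper would then write `file.xml` next to it.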
    
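
As an alternative to calling decode() on every line, gzip.open also accepts text mode ('rt'), in which case it decodes for you. A minimal sketch, assuming the file holds UTF-8 text; the helper name `read_gz_lines` is hypothetical:

```python
import gzip


def read_gz_lines(path, encoding="utf-8"):
    """Return the lines of a gzip-compressed text file as str objects."""
    # "rt" wraps the binary stream in a text decoder, so each line is
    # already a str and no per-line decode() call is needed.
    with gzip.open(path, "rt", encoding=encoding) as f:
        return [line.rstrip("\n") for line in f]
```

If the encoding is unknown or the content is not text at all, binary mode ('rb') as in the answers above remains the safer choice.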