Using csvreader against a gzipped file in Python

無奈伤痛 2020-12-25 11:35

I have a bunch of gzipped CSV files that I'd like to open for inspection using Python's built-in CSV reader. I'd like to do this without having first to manually unzip t

3 answers
  • 2020-12-25 11:50

    Use the gzip module:

    import csv
    import gzip
    
    with gzip.open(filename, mode='rt', newline='') as f:
        reader = csv.reader(f)
        #...
    
  • 2020-12-25 11:51

    I tried the above version for writing and reading, and it didn't work in Python 3.3 because of a "bytes" error. After some trial and error I got the following to work; maybe it helps others too:

    import csv
    import gzip
    import io
    
    
    with gzip.open("test.gz", "w") as file:
        writer = csv.writer(io.TextIOWrapper(file, newline="", write_through=True))
        writer.writerow([1, 2, 3])
        writer.writerow([4, 5, 6])
    
    with gzip.open("test.gz", "r") as file:
        reader = csv.reader(io.TextIOWrapper(file, newline=""))
        print(list(reader))
    

    As amohr suggests, the following works as well:

    import gzip, csv
    
    with gzip.open("test.gz", "wt", newline="") as file:
        writer = csv.writer(file)
        writer.writerow([1, 2, 3])
        writer.writerow([4, 5, 6])
    
    with gzip.open("test.gz", "rt", newline="") as file:
        reader = csv.reader(file)
        print(list(reader))
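
The "bytes" error mentioned in this answer can be reproduced with a small sketch. It assumes Python 3, and the file name `sample.csv.gz` and the variables `binary_error` and `text_rows` are made up for illustration. In binary mode gzip yields bytes lines, which `csv.reader` rejects; in text mode it yields str lines, which parse normally.

```python
import csv
import gzip
import os
import tempfile

# Hypothetical sample file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "sample.csv.gz")
with gzip.open(path, "wt", newline="") as f:
    csv.writer(f).writerows([["a", "b"], ["1", "2"]])

# Binary mode: iterating the file yields bytes, which csv.reader refuses.
binary_error = None
with gzip.open(path, "rb") as f:
    try:
        next(csv.reader(f))
    except csv.Error as exc:
        binary_error = str(exc)

# Text mode: iterating the file yields str, so csv.reader parses it normally.
with gzip.open(path, "rt", newline="") as f:
    text_rows = list(csv.reader(f))

print(binary_error)
print(text_rows)
```

The exact wording of the error message differs between Python versions, so it is printed here rather than quoted.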
    
  • 2020-12-25 11:55

    A more complete solution:

    import csv, gzip
    class GZipCSVReader:
        def __init__(self, filename):
            # Open in text mode so csv receives str, not bytes (Python 3)
            self.gzfile = gzip.open(filename, 'rt', newline='')
            self.reader = csv.DictReader(self.gzfile)
        def __next__(self):
            return next(self.reader)
        def close(self):
            self.gzfile.close()
        def __iter__(self):
            return iter(self.reader)
    

    Now you can use it like this:

    r = GZipCSVReader('my.csv')
    for row in r:
        # Each row is a dict, so iterate over its key/value pairs
        for k, v in row.items():
            print(k, v)
    r.close()
    

    EDIT: Following the comment below, how about a simpler approach:

    def gzipped_csv(filename):
        # Text mode ('rt') is required in Python 3; csv cannot parse bytes
        with gzip.open(filename, 'rt', newline='') as f:
            r = csv.DictReader(f)
            for row in r:
                yield row
    

    which then lets you write:

    for row in gzipped_csv(filename):
        for k, v in row.items():
            print(k, v)
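
For anyone who wants to try the generator approach end to end, here is a self-contained sketch. It assumes Python 3. The `encoding` parameter, the file name `people.csv.gz`, and the sample rows are additions for illustration and are not part of the answer above.

```python
import csv
import gzip
import os
import tempfile


def gzipped_csv(filename, encoding="utf-8"):
    """Yield each row of a gzipped CSV file as a dict keyed by the header row."""
    # 'rt' decodes the decompressed bytes to str; newline='' leaves
    # line-ending handling to the csv module.
    with gzip.open(filename, "rt", encoding=encoding, newline="") as f:
        yield from csv.DictReader(f)


# Hypothetical sample data, written only so the example runs on its own.
path = os.path.join(tempfile.mkdtemp(), "people.csv.gz")
with gzip.open(path, "wt", encoding="utf-8", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "age"])
    writer.writerow(["Ada", "36"])
    writer.writerow(["Linus", "54"])

rows = list(gzipped_csv(path))
for row in rows:
    for k, v in row.items():
        print(k, v)
```

Note that the file stays open until the generator is exhausted or closed, so consume it fully (as `list(...)` does here) or call `.close()` on it if you stop early.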
    