Reading a gzipped CSV file in Python 3

Submitted by 淺唱寂寞╮ on 2020-03-18 04:31:06

Question


I'm having problems reading from a gzipped CSV file with the gzip and csv libraries. Here's what I have:

import gzip
import csv
import json

f = gzip.open(filename)
csvobj = csv.reader(f, delimiter=',', quotechar="'")
for line in csvobj:
    ts = line[0]
    data_json = json.loads(line[1])

But this throws an exception:

  File "C:\Users\yaronol\workspace\raw_data_from_s3\s3_data_parser.py", line 64, in download_from_S3
    self.parse_dump_file(filename)
  File "C:\Users\yaronol\workspace\raw_data_from_s3\s3_data_parser.py", line 30, in parse_dump_file
    for line in csvobj:
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

Gunzipping the file and opening the result with csv works fine. I've also tried decoding the file text to convert it from bytes to str...

What am I missing here?


Answer 1:


The default mode for gzip.open is 'rb'. If you want to work with str, you have to request text mode explicitly:

f = gzip.open(filename, mode="rt")

Off-topic: it is good practice to put I/O operations in a with block, so the file is closed automatically:

with gzip.open(filename, mode="rt") as f:
    rows = list(csv.reader(f, delimiter=',', quotechar="'"))  # read while the file is still open
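
Putting the text-mode fix together with the parsing loop from the question, a minimal self-contained sketch might look like the following. The helper name read_gzipped_csv, the sample file, and its contents are hypothetical and exist only so the example runs on its own. The encoding="utf-8" and newline="" arguments are assumptions about the data: the question does not state the file's encoding, and newline="" is what the csv documentation recommends for file objects passed to csv.reader.

```python
import csv
import gzip
import json
import os
import tempfile


def read_gzipped_csv(path):
    """Return (timestamp, parsed_json) pairs from a gzipped CSV file."""
    rows = []
    # "rt" makes gzip decode the bytes to str, which csv.reader requires.
    # The utf-8 encoding is an assumption about the data.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as f:
        for line in csv.reader(f, delimiter=",", quotechar="'"):
            rows.append((line[0], json.loads(line[1])))
    return rows


# Build a tiny sample file (hypothetical data) so the sketch is runnable.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.csv.gz")
with gzip.open(sample_path, mode="wt", encoding="utf-8", newline="") as f:
    f.write("""1431907200,'{"a": 1, "b": 2}'\n""")
    f.write("""1431907260,'{"a": 3, "b": 4}'\n""")

rows = read_gzipped_csv(sample_path)
print(rows)
```

All reading happens inside the with block, so the file is already fully consumed by the time it is closed.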



Answer 2:


You are opening the file in binary mode (which is the default for gzip.open).

Try instead:

import gzip
import csv
f = gzip.open(filename, mode='rt')
csvobj = csv.reader(f, delimiter=',', quotechar="'")
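
To illustrate the difference between the two modes, here is a small sketch. It shows that the default binary mode yields bytes (which csv.reader rejects) and that an already-open binary handle can be wrapped with io.TextIOWrapper as an alternative to reopening it with mode='rt'. The file name and sample data are hypothetical, and utf-8 is an assumed encoding.

```python
import csv
import gzip
import io
import os
import tempfile

# Hypothetical sample file, created only so the sketch runs on its own.
path = os.path.join(tempfile.mkdtemp(), "demo.csv.gz")
with gzip.open(path, mode="wt", encoding="utf-8", newline="") as f:
    f.write("a,b\n1,2\n")

# Default (binary) mode: lines come back as bytes.
with gzip.open(path) as f:
    first_binary_line = f.readline()

# Feeding that binary handle to csv.reader raises csv.Error.
with gzip.open(path) as f:
    try:
        next(csv.reader(f))
        binary_error = None
    except csv.Error as exc:
        binary_error = str(exc)

# Wrapping the binary handle gives csv.reader the str lines it expects.
with gzip.open(path, mode="rb") as raw:
    text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
    wrapped_rows = list(csv.reader(text))

print(type(first_binary_line).__name__, binary_error, wrapped_rows)
```

The wrapper route is mainly useful when the binary file object comes from somewhere else and cannot simply be reopened in text mode.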



Answer 3:


A late answer: you can also use the datatable package in Python:

import datatable as dt
df = dt.fread(filename)
df.head()


Source: https://stackoverflow.com/questions/30324503/reading-gzipped-csv-file-in-python-3
