Question
I am reading a .gz file and converting it to Avro format. With codec='deflate' it works fine,
i.e., I am able to convert the file to Avro. With codec='snappy' it throws the following
error:
raise DataFileException("Unknown codec: %r" % codec)
avro.datafile.DataFileException: Unknown codec: 'snappy'
With deflate (works fine):
writer = DataFileWriter(open(avro_file, "wb"), DatumWriter(), schema, codec='deflate')
With snappy (throws the error):
writer = DataFileWriter(open(avro_file, "wb"), DatumWriter(), schema, codec='snappy')
A quick response would be a great help.
Thanks.
Answer 1:
From avro/datafile.py:
try:
    import snappy
    has_snappy = True
except ImportError:
    has_snappy = False
...
# Codecs supported by container files:
VALID_CODECS = frozenset(['null', 'deflate'])
if has_snappy:
    VALID_CODECS = frozenset.union(VALID_CODECS, ['snappy'])
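So 'snappy' is only accepted as a codec when the snappy module can be imported. You have to install the python-snappy library (for example, pip install python-snappy).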
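If the script may run where python-snappy is missing, one option is to choose the codec at runtime with the same import check that avro/datafile.py uses. The following is only a sketch: pick_codec and CODEC_MODULES are hypothetical names that are not part of avro, and the DataFileWriter call is left as a comment so the snippet runs without avro installed.

```python
import importlib

# Hypothetical mapping: Avro codec name -> Python module it depends on.
# Codecs not listed here ('null', 'deflate') need no extra module.
CODEC_MODULES = {"snappy": "snappy"}


def pick_codec(preferred="snappy", fallback="deflate"):
    """Return `preferred` if the module it needs imports cleanly, else `fallback`."""
    module_name = CODEC_MODULES.get(preferred)
    if module_name is None:
        # No extra dependency is recorded for this codec.
        return preferred
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Same condition under which avro leaves 'snappy' out of VALID_CODECS.
        return fallback
    return preferred


codec = pick_codec()
print("Using codec:", codec)
# Then pass it to the writer from the question:
# writer = DataFileWriter(open(avro_file, "wb"), DatumWriter(), schema, codec=codec)
```

Falling back silently changes the compression of the output file, so in a pipeline that requires snappy it may be better to let the error surface and install the library instead.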
Source: https://stackoverflow.com/questions/39699490/issue-in-using-snappy-with-avro-in-python