How to snappy compress a file using a python script

Posted by 感情迁移 on 2019-12-10 11:55:35

Question


I am trying to compress a CSV file in snappy format using a Python script and the python-snappy module. This is my code so far:

import snappy
d = snappy.compress("C:\\Users\\my_user\\Desktop\\Test\\Test_file.csv")
with open("compressed_file.snappy", 'w') as snappy_data:
     snappy_data.write(d)
snappy_data.close()

This code actually creates a snappy file, but the snappy file created only contains a string: "C:\Users\my_user\Desktop\Test\Test_file.csv"

So I am a bit lost on how to get my CSV compressed. I did get it working from the Windows command prompt with this command:

python -m snappy -c Test_file.csv compressed_file.snappy

But I need it to be done as part of a Python script, so running it from cmd is not an option for me.

Thank you very much, Álvaro


Answer 1:


You are compressing the path string itself: snappy.compress takes the raw data to compress, not a file name.

There are two ways to compress data with snappy: as a single block, or as streaming (framed) data.

This function compresses a file using the framed method:

import snappy

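def snappy_compress(path):
    # Store the compressed copy next to the original file
    path_to_store = path + '.snappy'

    # Both files are opened in binary mode: the input is read as raw bytes
    # and stream_compress writes bytes (text mode 'w' would fail or corrupt the output)
    with open(path, 'rb') as in_file:
        with open(path_to_store, 'wb') as out_file:
            snappy.stream_compress(in_file, out_file)

    # No explicit close() calls are needed; the with blocks close both files
    return path_to_store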

snappy_compress('testfile.csv')

You can decompress from command line using:

python -m snappy -d testfile.csv.snappy testfile_decompressed.csv

Note that the framing format currently used by python-snappy is not compatible with the framing used by Hadoop.



Source: https://stackoverflow.com/questions/45711565/how-to-snappy-compress-a-file-using-a-python-script
