How to encrypt a large dataset using python-gnupg without sucking up all the memory?


Question


I have a very large text file on disk. Assume it is 1 GB or more. Also assume the data in this file has a \n character every 120 characters.

I am using python-gnupg to encrypt this file. Since the file is so large, I cannot read the entire file into memory at one time.

However, the gnupg.encrypt() method that I'm using requires that I send in all the data at once -- not in chunks. So how can I encrypt the file without using up all my system memory?

Here is some sample code:

import gnupg
gpg = gnupg.GPG(gnupghome='/Users/syed.saqibali/.gnupg/')

with open("EncryptedOutputFile.dat", "a") as outfile:
    for curr_line in open("VeryLargeFile.txt", "r"):
        # Each call produces a separate, complete OpenPGP message.
        encrypted_ascii_data = gpg.encrypt(curr_line, "recipient@gmail.com")
        outfile.write(str(encrypted_ascii_data))

This sample produces an invalid output file because I cannot simply concatenate encrypted blobs together into a file.


Answer 1:


Encrypting line by line produces a vast number of separate OpenPGP documents, which not only makes decryption more complicated, but also massively blows up the file size and computation effort.

The python-gnupg module also provides an encrypt_file method, which takes a stream as input and accepts an optional output parameter for writing the result directly to a file.

with open("VeryLargeFile.txt", "r") as infile:
    gpg.encrypt_file(
            infile, "recipient@example.org",
            output="EncryptedOutputFile.dat")

This results in a streaming behavior with constant and low memory requirements.
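
For completeness, the single OpenPGP message written this way can also be decrypted as a stream. A minimal sketch, assuming the matching private key is in the same keyring; the passphrase and output file name here are placeholders:

import gnupg

gpg = gnupg.GPG(gnupghome='/Users/syed.saqibali/.gnupg/')

# Stream the ciphertext through GnuPG and write the plaintext straight to
# disk, again without loading the whole file into memory.
with open("EncryptedOutputFile.dat", "rb") as encrypted_file:
    result = gpg.decrypt_file(
            encrypted_file,
            passphrase="my-secret-passphrase",  # placeholder
            output="DecryptedOutputFile.txt")

print(result.ok, result.status)  # check whether decryption succeeded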




Answer 2:


I added a "b" (for binary) to the open command and it worked great for my code. Encrypting this way for some reason is slower than half the speed of encrpyting via shell/bash command though.



Source: https://stackoverflow.com/questions/35421076/how-to-encrypt-a-large-dataset-using-python-gnupg-without-sucking-up-all-the-mem
