Question
I have a very large text file on disk. Assume it is 1 GB or more. Also assume the data in this file has a \n
character every 120 characters.
I am using python-gnupg to encrypt this file. Since the file is so large, I cannot read the entire file into memory at one time.
However, the gnupg.encrypt()
method that I'm using requires that I send in all the data at once -- not in chunks. So how can I encrypt the file without using up all my system memory?
Here is some sample code:
import gnupg
gpg = gnupg.GPG(gnupghome='/Users/syed.saqibali/.gnupg/')
for curr_line in open("VeryLargeFile.txt", "r").xreadlines():
    encrypted_ascii_data = gpg.encrypt(curr_line, "recipient@gmail.com")
    open("EncryptedOutputFile.dat", "a").write(encrypted_ascii_data)
This sample produces an invalid output file because I cannot simply concatenate encrypted blobs together into a file.
Answer 1:
Encrypting line by line results in a vast number of OpenPGP documents, which will not only make decryption more complicated but also massively increase the file size and computation effort.
The GnuPG module for Python also provides a method encrypt_file, which takes a stream as input and accepts an optional output
parameter to write the result directly to a file.
with open("VeryLargeFile.txt", "r") as infile:
    gpg.encrypt_file(
        infile, "recipient@example.org",
        output="EncryptedOutputFile.dat")
This results in a streaming behavior with constant and low memory requirements.
Answer 2:
I added a "b" (for binary) to the open call and it worked great for my code. For some reason, though, encrypting this way runs at less than half the speed of encrypting via a shell/bash command.
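For reference, here is a minimal sketch combining both answers: open the input in binary mode and pass it to encrypt_file with the output parameter. The gnupghome path and recipient address are placeholders taken from the examples above; the result check uses the status object that python-gnupg returns.

import gnupg

# gnupghome and recipient are placeholders; substitute your own setup.
gpg = gnupg.GPG(gnupghome='/Users/syed.saqibali/.gnupg/')

# Opening in binary mode ("rb") avoids text-mode decoding and lets
# python-gnupg stream the raw bytes to gpg without loading the file.
with open("VeryLargeFile.txt", "rb") as infile:
    result = gpg.encrypt_file(
        infile, "recipient@example.org",
        output="EncryptedOutputFile.dat")

# encrypt_file returns a status object; check it rather than assuming success.
if not result.ok:
    print(result.status)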
Source: https://stackoverflow.com/questions/35421076/how-to-encrypt-a-large-dataset-using-python-gnupg-without-sucking-up-all-the-mem