I have a very big file (4 GB) and when I try to read it my computer hangs. So I want to read it piece by piece and, after processing each piece, store the processed piece into an output file before reading the next piece.
To write a lazy function, just use yield:
def read_in_chunks(file_object, chunk_size=1024):
    """Lazy function (generator) to read a file piece by piece.
    Default chunk size: 1k."""
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data


with open('really_big_file.dat') as f:
    for piece in read_in_chunks(f):
        process_data(piece)
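If, as in your question, each processed piece should end up in another file, a minimal sketch could look like the following. Here process_data and the output filename 'processed_file.dat' are placeholders for illustration, and both files are opened in binary mode so the pieces are raw bytes:

def process_data(piece):
    # Hypothetical placeholder: replace with your real processing.
    return piece.upper()

with open('really_big_file.dat', 'rb') as f, open('processed_file.dat', 'wb') as out:
    for piece in read_in_chunks(f, chunk_size=1024 * 1024):  # read 1 MB at a time
        out.write(process_data(piece))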
Another option would be to use iter and a helper function:
f = open('really_big_file.dat')

def read1k():
    return f.read(1024)

for piece in iter(read1k, ''):
    process_data(piece)
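In Python 3 you can get the same effect without a named helper by passing functools.partial to iter. Note that the sentinel must match what read() returns at end of file: '' in text mode (as above), b'' in binary mode. A sketch assuming binary mode:

from functools import partial

with open('really_big_file.dat', 'rb') as f:
    for piece in iter(partial(f.read, 1024), b''):
        process_data(piece)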
If the file is line-based, the file object is already a lazy generator of lines:
for line in open('really_big_file.dat'):
    process_data(line)
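If you prefer the file to be closed deterministically rather than left to garbage collection, the same lazy line iteration works inside a with block:

with open('really_big_file.dat') as f:
    for line in f:
        process_data(line)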