Question
At every step of a loop I produce some data that I eventually want saved to my hard disk.
One way:
import pickle

results = []                      # avoid shadowing the built-in name "list"
for i in range(10**10):           # range() needs an int; 1e10 is a float
    results.append(numpy_array_i)
with open(self.save_path, "wb") as f:
    pickle.dump(results, f, protocol=4)
But I worry that: 1) I may run out of memory because of the list; 2) if something crashes, all the data will be lost. Because of this I have also thought of a way to save the data in real time, such as:
import csv

with open("results.csv", "w", newline="") as f:   # illustrative filename
    writer = csv.writer(f)
    for i in range(10**10):
        writer.writerow(numpy_array_i)            # one row per array
Here too I worry that it may not be fast enough, and I am not sure what the best tools are; perhaps openpyxl is a good choice.
Answer 1:
Writing to Redis is pretty fast, and you can read from Redis in a second process and write the data to disk there.
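A minimal sketch of that pattern, assuming the redis-py package and a Redis server on localhost; the key name "results" and the RPUSH/BLPOP calls are illustrative choices, not part of the answer:

import pickle
import numpy as np
import redis

r = redis.Redis(host="localhost", port=6379)

for i in range(10**10):
    numpy_array_i = np.random.rand(100)              # stand-in for the real per-step data
    r.rpush("results", pickle.dumps(numpy_array_i))  # O(1) append to a Redis list

A second process could then drain the list and append each record to a single file:

import redis

r = redis.Redis(host="localhost", port=6379)
with open("results.pkl", "ab") as f:
    while True:
        item = r.blpop("results", timeout=10)  # block up to 10 s waiting for new data
        if item is None:
            break                              # nothing arrived for a while: assume the producer finished
        _key, payload = item
        f.write(payload)                       # concatenated pickles; read back with repeated pickle.load calls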
Answer 2:
I'd try SQLite, as it provides permanent storage on disk (so no data loss), yet it's faster than writing into a file as shown in your question, and it makes data lookup easier in case a previous run left you with incomplete data.
Tweaking JOURNAL_MODE can increase the performance further: https://blog.devart.com/increasing-sqlite-performance.html
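A minimal sketch of that approach, assuming each per-step numpy array is stored as a pickled BLOB; the database name, table layout, WAL journal mode, and batch size are illustrative choices:

import pickle
import sqlite3
import numpy as np

con = sqlite3.connect("results.db")
con.execute("PRAGMA journal_mode=WAL")   # one of the journal modes discussed in the linked article
con.execute("CREATE TABLE IF NOT EXISTS results (step INTEGER PRIMARY KEY, data BLOB)")

for i in range(10**10):
    numpy_array_i = np.random.rand(100)  # stand-in for the real per-step data
    con.execute("INSERT INTO results VALUES (?, ?)",
                (i, pickle.dumps(numpy_array_i, protocol=4)))
    if i % 1000 == 0:
        con.commit()                     # commit in batches: a commit per row would be slow

con.commit()
con.close()

Each committed batch is durable on disk, so a crash loses at most the last uncommitted batch, and a partial run can still be inspected with an ordinary SELECT.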
Source: https://stackoverflow.com/questions/60684887/what-is-the-fastest-way-to-record-real-time-data-in-python-with-least-memory-los