Question
I followed the question "Append multiple numpy files to one big numpy file in python" in order to combine many numpy files into one big file. The result is:
import matplotlib.pyplot as plt
import numpy as np
import glob
import os, sys
fpath ="path_Of_my_final_Big_File"
npyfilespath ="path_of_my_numpy_files"
os.chdir(npyfilespath)
npfiles= glob.glob("*.npy")
npfiles.sort()
all_arrays = np.zeros((166601,8000))
for i, npfile in enumerate(npfiles):
    all_arrays[i] = np.load(os.path.join(npyfilespath, npfile))
np.save(fpath, all_arrays)
data = np.load(fpath)
print(data)
print(data.shape)
I have thousands of files, and this code always fails with a memory error, so I can't produce the result file. How can I resolve this error? How can I read each file and write or append it to the final numpy file, one file at a time?
Answer 1:
Take a look at np.memmap. You can instantiate all_arrays as a memory-mapped array:
all_arrays = np.memmap("all_arrays.dat", dtype='float64', mode='w+', shape=(166601,8000))
From the documentation:
Memory-mapped files are used for accessing small segments of large files on disk, without reading the entire file into memory.
You will be able to access the whole array, but the operating system will take care of loading only the parts you actually need. Read the documentation page carefully, and note that, for performance, you can choose whether the file is stored column-wise or row-wise (the order argument of np.memmap).
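As a minimal sketch of how the memory-mapped array could be filled file by file, assuming every input file holds a 1-D array of the same length: the helper merge_npy_files, its parameters, and the part_*.npy file names below are hypothetical and are not part of the original answer. The demo uses tiny stand-in arrays so it runs anywhere.

```python
import glob
import os
import tempfile

import numpy as np


def merge_npy_files(npy_dir, out_path, n_cols, dtype="float64"):
    """Copy each 1-D .npy file in npy_dir into one row of a memory-mapped file.

    Hypothetical helper: assumes every input array has exactly n_cols elements.
    """
    npfiles = sorted(glob.glob(os.path.join(npy_dir, "*.npy")))
    shape = (len(npfiles), n_cols)
    # mode="w+" creates (or overwrites) the raw data file on disk.
    merged = np.memmap(out_path, dtype=dtype, mode="w+", shape=shape)
    for i, npfile in enumerate(npfiles):
        # Only one input array is held in memory at a time.
        merged[i] = np.load(npfile)
    merged.flush()  # write pending changes to disk
    return shape


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Create three small stand-in input files.
        for k in range(3):
            np.save(os.path.join(tmp, "part_%03d.npy" % k), np.full(4, float(k)))
        out_path = os.path.join(tmp, "all_arrays.dat")
        shape = merge_npy_files(tmp, out_path, n_cols=4)
        # A raw memmap file has no header, so dtype and shape are given again.
        data = np.memmap(out_path, dtype="float64", mode="r", shape=shape)
        print(data.shape)
        print(data[2])
        del data  # release the mapping before the temporary directory is removed
```

With the shapes from the question, the call would look like merge_npy_files(npyfilespath, "all_arrays.dat", n_cols=8000). For scale, a float64 array of shape (166601, 8000) needs roughly 10.7 GB, which is why allocating it in one piece with np.zeros can fail. Note that the output is a raw binary file rather than a .npy file, so it has to be reopened with np.memmap using the same dtype and shape instead of np.load.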
Source: https://stackoverflow.com/questions/42385334/how-to-put-many-numpy-files-in-one-big-numpy-file-without-having-memory-error