How to put many numpy files in one big numpy file without having memory error?

Submitted by 帅比萌擦擦* on 2021-02-11 15:57:06

Question


I followed this question Append multiple numpy files to one big numpy file in python in order to combine many numpy files into one big file. The resulting code is:

import glob
import os

import numpy as np

fpath = "path_Of_my_final_Big_File"   # np.save appends ".npy" if missing
npyfilespath = "path_of_my_numpy_files"
os.chdir(npyfilespath)
npfiles = glob.glob("*.npy")
npfiles.sort()
all_arrays = np.zeros((166601, 8000))
for i, npfile in enumerate(npfiles):
    all_arrays[i] = np.load(os.path.join(npyfilespath, npfile))
np.save(fpath, all_arrays)
data = np.load(fpath + ".npy")        # np.load needs the full file name
print(data)
print(data.shape)

I have thousands of files, and with this code I always get a memory error, so I never obtain the result file. How can I resolve this error? How can I read, write, and append to the final numpy array one file at a time?


Answer 1:


Have a look at np.memmap. You can instantiate all_arrays as:

all_arrays = np.memmap("all_arrays.dat", dtype='float64', mode='w+', shape=(166601, 8000))

From the documentation:

Memory-mapped files are used for accessing small segments of large files on disk, without reading the entire file into memory.

You will still be able to access the whole array, but the operating system takes care of loading only the parts you actually touch into memory. Read the documentation page carefully, and note that, from a performance point of view, you can choose whether the file is stored row-wise or column-wise (the order parameter).
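Putting the two ideas together, the original loop can write directly into the memory-mapped array, so only one input file is ever held in RAM at a time. Below is a minimal, self-contained sketch: it uses a small hypothetical directory (npy_parts) with tiny 8-element rows instead of the real 166601×8000 data, purely so the example runs quickly; swap in your own paths and shape.

```python
import glob
import os

import numpy as np

# Hypothetical setup: create a few small .npy files to stand in for the
# thousands of real input files (each one row of the final array).
npyfilespath = "npy_parts"
os.makedirs(npyfilespath, exist_ok=True)
for i in range(3):
    np.save(os.path.join(npyfilespath, f"part_{i:03d}.npy"),
            np.full(8, float(i)))

npfiles = sorted(glob.glob(os.path.join(npyfilespath, "*.npy")))

# Allocate the big array on disk; the OS pages in only the rows being written.
all_arrays = np.memmap("all_arrays.dat", dtype="float64", mode="w+",
                       shape=(len(npfiles), 8))
for i, npfile in enumerate(npfiles):
    all_arrays[i] = np.load(npfile)   # one file in memory at a time
all_arrays.flush()                    # make sure everything hits the disk
```

Note that the result is a raw binary file, not a .npy file: to read it back you reopen it with np.memmap using the same dtype and shape (e.g. mode="r"), since the file itself stores no header.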



Source: https://stackoverflow.com/questions/42385334/how-to-put-many-numpy-files-in-one-big-numpy-file-without-having-memory-error
