Question
I want to load more than 10,000 images into my 8 GB of RAM in the form of NumPy arrays. So far I have tried cv2.imread, keras.preprocessing.image.load_img, PIL, imageio, and scipy. I want to do it the fastest way possible, but I can't figure out which one that is.
Answer 1:
One of the fastest ways is to have multiple processors do the job in parallel: when concurrent execution isn't an issue, parallelising the work brings several processors to bear on your task at the same time, which can speed things up considerably. The example below is just a simple sketch of how it might look; practice with small functions first and then integrate it with your own code:
from multiprocessing import Process

# this is the function to be parallelised
def image_load_here(image_paths):
    pass

if __name__ == '__main__':
    # Start the worker process and provide your dataset.
    # Note: args must be a keyword argument holding a tuple;
    # passing the list positionally would be a TypeError.
    p = Process(target=image_load_here,
                args=(['img1', 'img2', 'img3', 'img4'],))
    p.start()
    p.join()
Feel free to write; I'll try to help.
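A related approach (not from the answer above) is a thread pool: image decoding in libraries like cv2 and PIL happens mostly in C code that releases the GIL, so threads can also give a real speedup while avoiding the pickling overhead of separate processes. A minimal sketch, where `load_image` is a hypothetical stand-in for your real loader (e.g. `cv2.imread`):

```python
from concurrent.futures import ThreadPoolExecutor

def load_image(path):
    # Placeholder so the sketch is self-contained: a real loader
    # would return e.g. cv2.imread(path) as a NumPy array.
    return len(path)

paths = ['img1.jpg', 'img2.jpg', 'img3.jpg', 'img4.jpg']

# executor.map runs load_image on the paths concurrently and
# yields the results in the same order as the input list.
with ThreadPoolExecutor(max_workers=8) as executor:
    images = list(executor.map(load_image, paths))
```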
Answer 2:
If you're using the keras library to build a deep learning model, I suggest using the image class from the keras.preprocessing package. The image class provides a method img_to_array, which already returns a NumPy array. It also uses NumPy internally for all its array manipulations/computations.
train_image = image.load_img(path, target_size = (height, width))
train_image = image.img_to_array(train_image)
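For illustration (not part of the answer), img_to_array is essentially a thin wrapper around converting a PIL image with NumPy; a self-contained sketch using Pillow directly, with an in-memory image so no files are needed:

```python
import numpy as np
from PIL import Image

# Create a small in-memory RGB image instead of reading one from
# disk, so the sketch runs without any image files on hand.
img = Image.new('RGB', (4, 3), color=(10, 20, 30))

arr = np.asarray(img, dtype='float32')
# PIL reports size as (width, height); the resulting array has
# shape (height, width, channels).
```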
Answer 3:
import numpy as np
import os
from keras.preprocessing import image

# Assumes config.IMG_INPUT_SHAPE and label_mapping are defined elsewhere.
def batch_data_generator(data, indexes):
    # indexes is a sub-array of indices into the data
    X = np.zeros((len(indexes), config.IMG_INPUT_SHAPE[0],
                  config.IMG_INPUT_SHAPE[1], config.IMG_INPUT_SHAPE[2]))
    Y = np.zeros((len(indexes), len(label_mapping)))
    i = 0
    for idx in indexes:
        image_id = data['X'][idx]
        filename = os.path.join('images', str(image_id) + '.jpg')
        img = image.load_img(filename, target_size=(300, 300))
        X[i] = np.array(img, dtype='float32')
        label_id = label_mapping[data['Y'][idx]]
        Y[i][label_id] = 1  # one-hot encode the label
        i += 1
    # subtract the per-channel mean and normalize
    for depth in range(3):
        X[:, :, :, depth] = (X[:, :, :, depth] - np.mean(X[:, :, :, depth])) / 255
    return X, Y
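The per-channel normalization loop at the end can also be written as a single vectorized NumPy expression. A small self-contained sketch on random data (the variable names are illustrative, not from the answer):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 255, size=(4, 300, 300, 3))  # fake batch of 4 RGB images

# Mean over batch, height, and width for each channel;
# keepdims=True keeps shape (1, 1, 1, 3) so it broadcasts against X.
channel_mean = X.mean(axis=(0, 1, 2), keepdims=True)
X_norm = (X - channel_mean) / 255
```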
Source: https://stackoverflow.com/questions/50233954/fastest-way-to-load-images-in-python-for-processing