Resizing images in Keras ImageDataGenerator flow methods

攒了一身酷 2020-12-16 10:56

The Keras ImageDataGenerator class provides the two flow methods flow(X, y) and flow_from_directory(directory) (https://keras.io/prepr

4 Answers
  • 2020-12-16 11:14
    import skimage.transform
    X_data_resized = [skimage.transform.resize(image, new_shape) for image in X_data]

    Use this instead of scipy.misc.imresize, which is now deprecated.

  • 2020-12-16 11:19

    For a large training dataset, performing transformations such as resizing on the entire dataset at once consumes a lot of memory. As Keras does in ImageDataGenerator, it is better to do it batch by batch. As far as I know, there are two ways to achieve this without processing the whole dataset at once:

    1. You can use a Lambda layer and feed the original training data to it. Its output is the resized data you need.

    Here is the sample code if you use TensorFlow as the backend of Keras:

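    import tensorflow as tf

    original_dim = (32, 32, 3)
    target_size = (64, 64)
    inputs = tf.keras.layers.Input(shape=original_dim)
    # Resize each incoming batch inside the model, so no resized copy of the dataset is stored
    x = tf.keras.layers.Lambda(lambda image: tf.image.resize(image, target_size))(inputs)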
    
    2. As @Retardust mentioned, maybe you can customize your own ImageDataGenerator as well as the preprocessing_function.
  • 2020-12-16 11:19

    For anyone else who wants to do this: the .flow method of ImageDataGenerator has no target_size parameter, and preprocessing_function cannot be used to resize either, because the documentation states: "The function will run after the image is resized and augmented. The function should take one argument: one image (Numpy tensor with rank 3), and should output a Numpy tensor with the same shape." So to use .flow you have to pass images that are already resized; otherwise, use a custom generator that resizes them on the fly.

    Here's a sample custom generator in Keras (it can also be written as a plain Python generator or in any other way):

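    import math
    from tensorflow import keras

    class Custom_Generator(keras.utils.Sequence):
        def __init__(self, train_labels, datapath, batch_size):
            # Add whatever other arguments your data needs
            self.train_labels = train_labels
            self.datapath = datapath
            self.batch_size = batch_size

        def __len__(self):
            # Number of batches per epoch, not number of samples
            return math.ceil(len(self.train_labels) / self.batch_size)

        def load_and_preprocess_function(self, label_names):
            # Load the data for this batch using the label names, with whatever
            # library you like, and resize it here
            raise NotImplementedError

        def __getitem__(self, idx):
            # idx is a batch index, so convert it to a sample offset first
            start = idx * self.batch_size
            batch_y = self.train_labels[start:start + self.batch_size]
            batch_x = self.load_and_preprocess_function(batch_y)
            return batch_x, batch_y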
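
    The plain Python generator variant mentioned above could look roughly like the sketch below. It wraps any iterator of (x, y) batches and resizes x on the fly. The names resize_nearest and resized_batches are made up for this illustration, and the NumPy-only nearest-neighbour resize is just a stand-in; a function such as skimage.transform.resize applied per image could be passed as resize_fn instead.

```python
import numpy as np


def resize_nearest(batch, target_size):
    """Nearest-neighbour resize of a batch shaped (N, H, W, C); NumPy-only stand-in."""
    height, width = batch.shape[1:3]
    target_h, target_w = target_size
    # For every output row/column, pick the source row/column it maps to.
    rows = np.arange(target_h) * height // target_h
    cols = np.arange(target_w) * width // target_w
    return batch[:, rows[:, None], cols[None, :], :]


def resized_batches(batches, target_size, resize_fn=resize_nearest):
    """Yield (x, y) batches from `batches` with x resized on the fly."""
    for batch_x, batch_y in batches:
        yield resize_fn(batch_x, target_size), batch_y
```

    Assuming datagen is an ImageDataGenerator, something like resized_batches(datagen.flow(X, y, batch_size=32), (64, 64)) should then give resized, augmented batches. Since flow loops forever, the wrapper does too, so the number of steps per epoch has to be given when training.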
    
  • 2020-12-16 11:21

    flow_from_directory(directory) generates augmented images from a directory holding an arbitrary collection of images, so it needs the target_size parameter to bring all images to the same shape.

    flow(X, y), by contrast, augments images that are already stored in X, which is just a NumPy array and can easily be preprocessed or resized before being passed to flow, so no target_size parameter is needed. For resizing I prefer scipy.misc.imresize over PIL.Image resize or cv2.resize, as it operates directly on NumPy image data.

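    import numpy as np
    import scipy.misc  # imresize was deprecated in SciPy 1.0 and removed in 1.3

    new_shape = (28, 28, 3)
    X_train_new = np.empty(shape=(X_train.shape[0],) + new_shape)
    for idx in range(X_train.shape[0]):  # range, not xrange, on Python 3
        X_train_new[idx] = scipy.misc.imresize(X_train[idx], new_shape)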
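
    On recent SciPy releases scipy.misc.imresize is no longer available, and its deprecation note pointed to Pillow as the replacement. A rough equivalent of the loop above, assuming Pillow is installed and X_train holds uint8 RGB images, might look like this (resize_dataset is a made-up helper name, and the random array only stands in for real data):

```python
import numpy as np
from PIL import Image


def resize_dataset(images, target_hw):
    """Resize a uint8 array shaped (N, H, W, 3) to (N, target_h, target_w, 3)."""
    target_h, target_w = target_hw
    resized = np.empty(
        (images.shape[0], target_h, target_w, images.shape[3]), dtype=np.uint8
    )
    for idx, image in enumerate(images):
        # Pillow expects (width, height), the reverse of NumPy's (rows, cols).
        pil_image = Image.fromarray(image).resize((target_w, target_h))
        resized[idx] = np.asarray(pil_image)
    return resized


# Random uint8 data standing in for X_train.
X_train = np.random.randint(0, 256, size=(10, 32, 32, 3), dtype=np.uint8)
X_train_new = resize_dataset(X_train, (28, 28))
```

    X_train_new can then be passed to flow in place of X_train.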
    