TensorFlow input dataset with varying-size images

Question

I'm trying to train a fully convolutional neural network on input images of different sizes. I can do this by looping over the training images and creating a single NumPy input function at each iteration, i.e.:

for image_input, label in zip(image_data, labels):
    # Build an input function for a single image/label pair.
    train_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={"x": image_input},
        y=label,
        batch_size=1,
        num_epochs=None,
        shuffle=False)
    # Train for one step on this image.
    fcn_classifier.train(input_fn=train_input_fn, steps=1)

However, this way the model is saved and reloaded after every step, which wastes a huge amount of resources. I have also tried creating the whole dataset at once using a generator, i.e.:

def input_func_gen():
    dataset = tf.data.Dataset.from_generator(generator=generator, 
                                  output_types=(tf.float32, tf.int32))
    dataset = dataset.batch(1)
    iterator = dataset.make_one_shot_iterator()
    return iterator.get_next()

def generator():
    filenames = ['building-d-mapimage-10-gt.png',
                 'building-dmapimage-16-gt.png']
    i = 0
    while i < len(filenames):        
        features, labels = loading.read_image_data(filenames[i])
        yield features, labels
        i += 1
        if i >= len(filenames):
            i = 0
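
The index bookkeeping in the generator above amounts to cycling over the file list forever, re-reading each image on every pass. A minimal sketch of the same behaviour using itertools.cycle follows; read_image_data here is a hypothetical stand-in for loading.read_image_data (whose implementation is not shown), and it returns placeholder strings instead of arrays.

```python
import itertools


def read_image_data(filename):
    # Hypothetical stand-in for loading.read_image_data: returns
    # placeholder (features, labels) values instead of real arrays.
    return "features:" + filename, "labels:" + filename


def generator(filenames):
    # itertools.cycle repeats the file names endlessly, which replaces
    # the manual "reset i to 0" logic.
    for name in itertools.cycle(filenames):
        yield read_image_data(name)


gen = generator(["a.png", "b.png"])
# Take three items to show the wrap-around back to the first file.
first_three = list(itertools.islice(gen, 3))
print(first_three)
```

Because the generator never terminates, the number of training steps has to be bounded elsewhere, for example through the steps argument of train.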

And then:

fcn_classifier.train(input_fn=input_func_gen, steps=100)

However, this way training becomes very slow and runs out of memory after the first iteration, which suggests that something is wrong with the dataset (training runs much faster in the first case, where single inputs are used). Also, the features produced by the generator have shape (1, image_height, image_width, 3). However, in the model I have to reshape them to 4-D tensors as

input_shape = tf.shape(input)
input = tf.reshape(input, [1, input_shape[2], input_shape[3], 3])

instead of tf.reshape(input, [1, input_shape[1], input_shape[2], 3]), which suggests that something is off with the dimensions of the input. In the first case I can use the input directly, with no reshape needed.
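
One likely explanation for the shifted indices: the generator already yields features of shape (1, image_height, image_width, 3), and dataset.batch(1) stacks elements along a new leading axis, so the model would receive a 5-D tensor. The sketch below only does shape arithmetic with plain tuples; the 480x640 image size is an assumed example, and batched_shape is an illustrative helper, not a TensorFlow function.

```python
def batched_shape(element_shape, batch_size):
    # Batching stacks `batch_size` equally shaped elements along a new
    # leading axis, so the batch size is prepended to the element shape.
    return (batch_size,) + tuple(element_shape)


# Assumed example: one 480x640 RGB image that already carries a
# leading dimension of 1, as yielded by the generator.
element_shape = (1, 480, 640, 3)

after_batch = batched_shape(element_shape, 1)
print(after_batch)

# Height and width now sit at indices 2 and 3 rather than 1 and 2.
height, width = after_batch[2], after_batch[3]
print(height, width)
```

Under that reading, either dropping the batch(1) call or yielding features without the leading 1 would put height and width back at indices 1 and 2.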

Answer 1:

I managed to solve the problem with varying-size images by changing input_func_gen to the following:

import itertools

def input_func_gen():
    load_path = '/path_to_images'
    data_set = 'dataset_to_use'
    image_data, labels = loading.load_image_data_grayscale(load_path, data_set)
    # Yield (image, label) pairs; None marks dimensions that vary per image.
    dataset = tf.data.Dataset.from_generator(
        lambda: itertools.zip_longest(image_data, labels),
        output_types=(tf.float32, tf.int32),
        output_shapes=(tf.TensorShape([1, None, None, 3]),
                       tf.TensorShape([1, None])))
    # Each element already has a leading dimension of 1, so no
    # dataset.batch() call is used; repeat() makes the dataset endless.
    dataset = dataset.repeat()
    iterator = dataset.make_one_shot_iterator()
    return iterator.get_next()
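
A note on the pairing step: itertools.zip_longest pads the shorter input with None, whereas the built-in zip stops at the shortest one. If image_data and labels ever differ in length, a None label would probably fail to convert to tf.int32. The sketch below shows the difference with placeholder strings standing in for the arrays (the names are made up for illustration).

```python
import itertools

# Placeholder stand-ins for image arrays and label arrays; one label
# is missing on purpose to show how the two functions differ.
images = ["img_a", "img_b", "img_c"]
labels = ["lab_a", "lab_b"]

# zip_longest fills the gap with None.
padded = list(itertools.zip_longest(images, labels))
# zip silently drops the unmatched image.
strict = list(zip(images, labels))

print(padded)
print(strict)
```

When both lists have the same length the two functions produce identical pairs, so the choice only matters for mismatched data.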

Source: https://stackoverflow.com/questions/51983716/tensorflow-input-dataset-with-varying-size-images

Tags

tensorflow

input