TensorFlow mixes up images and labels when making a batch

Submitted by 筅森魡賤 on 2020-01-14 02:08:15

Question


I've been stuck on this problem for weeks. I want to build an image batch from a list of image filenames. I insert the filename list into a queue and use a reader to get each file. The reader then returns the filename and the contents of the image file.

My problem is that when I make a batch from the decoded JPEG and the labels from the reader, tf.train.shuffle_batch() mixes up the images and the filenames, so the labels end up in the wrong order for the image files. Am I doing something wrong with the queue/shuffle_batch, and how can I fix it so that the batch comes out with the right labels for the right files?

Many thanks!

import tensorflow as tf
from tensorflow.python.framework import ops


def preprocess_image_tensor(image_tf):
  image = tf.image.convert_image_dtype(image_tf, dtype=tf.float32)
  image = tf.image.resize_image_with_crop_or_pad(image, 300, 300)
  image = tf.image.per_image_standardization(image)
  return image

# original image names and labels
image_paths = ["image_0.jpg", "image_1.jpg", "image_2.jpg", "image_3.jpg", "image_4.jpg", "image_5.jpg", "image_6.jpg", "image_7.jpg", "image_8.jpg"]

labels = [0, 1, 2, 3, 4, 5, 6, 7, 8]

# converting arrays to tensors
image_paths_tf = ops.convert_to_tensor(image_paths, dtype=tf.string, name="image_paths_tf")
labels_tf = ops.convert_to_tensor(labels, dtype=tf.int32, name="labels_tf")

# getting tensor slices
image_path_tf, label_tf = tf.train.slice_input_producer([image_paths_tf, labels_tf], shuffle=False)

# getting image tensors from jpeg and performing preprocessing
image_buffer_tf = tf.read_file(image_path_tf, name="image_buffer")
image_tf = tf.image.decode_jpeg(image_buffer_tf, channels=3, name="image")
image_tf = preprocess_image_tensor(image_tf)

# creating a batch of images and labels
batch_size = 5
num_threads = 4
images_batch_tf, labels_batch_tf = tf.train.batch([image_tf, label_tf], batch_size=batch_size, num_threads=num_threads)

# running testing session to check order of images and labels 
init = tf.global_variables_initializer()
with tf.Session() as sess:
  sess.run(init)

  coord = tf.train.Coordinator()
  threads = tf.train.start_queue_runners(coord=coord)

  print image_path_tf.eval()
  print label_tf.eval()

  coord.request_stop()
  coord.join(threads)

Answer 1:


Wait... isn't your TensorFlow usage a little odd?

You are basically running the graph twice by calling:

  print image_path_tf.eval()
  print label_tf.eval()

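Each eval() call triggers its own dequeue, so the printed path and the printed label come from different elements. And since you only ask for image_path_tf and label_tf, nothing that comes after this line is run at all: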

image_path_tf, label_tf = tf.train.slice_input_producer([image_paths_tf, labels_tf], shuffle=False)

Maybe try this?

# fetch both batch tensors in a single run so they come from the same dequeue
images, labels = sess.run([images_batch_tf, labels_batch_tf])
print(images)
print(labels)
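
To make the "running the graph twice" point concrete, here is a minimal sketch in plain Python rather than TensorFlow. It assumes the input producer behaves like a FIFO of (path, label) pairs; the helper names make_queue, fetch_separately and fetch_together are hypothetical and only illustrate the idea.

```python
from collections import deque


def make_queue(paths, labels):
    """Build a FIFO of (path, label) pairs, like a producer with shuffle=False."""
    return deque(zip(paths, labels))


def fetch_separately(q):
    """Mimic path.eval() followed by label.eval(): each call dequeues a new pair."""
    path, _ = q.popleft()    # first run keeps the path and discards its label
    _, label = q.popleft()   # second run keeps the label of the NEXT pair
    return path, label


def fetch_together(q):
    """Mimic sess.run([path, label]): one dequeue supplies both values."""
    return q.popleft()


paths = ["image_%d.jpg" % i for i in range(9)]
labels = list(range(9))

separate = fetch_separately(make_queue(paths, labels))
together = fetch_together(make_queue(paths, labels))
print(separate)  # ('image_0.jpg', 1)
print(together)  # ('image_0.jpg', 0)
```

The mismatch in the first result comes from the two independent dequeues, not from the batching step itself.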



Answer 2:


From your code I'm unsure how your labels are encoded in, or extracted from, the JPEG images. I used to encode everything in the same file, but have since found a much more elegant solution. Assuming you can get a list of filenames, image_paths, and a NumPy array of labels, labels, you can bind them together and operate on individual examples with tf.train.slice_input_producer, then batch them together using tf.train.batch.

import tensorflow as tf
from tensorflow.python.framework import ops

shuffle = True
batch_size = 128
num_threads = 8

def get_data():
    """
    Return image_paths, labels such that labels[i] corresponds to image_paths[i].

    image_paths: list of strings
    labels: list/np array of labels
    """
    raise NotImplementedError()

def preprocess_image_tensor(image_tf):
    """Preprocess a single image."""
    image = tf.image.convert_image_dtype(image_tf, dtype=tf.float32)
    image = tf.image.resize_image_with_crop_or_pad(image, 300, 300)
    image = tf.image.per_image_standardization(image)
    return image

image_paths, labels = get_data()

image_paths_tf = ops.convert_to_tensor(image_paths, dtype=tf.string, name='image_paths')
labels_tf = ops.convert_to_tensor(labels, dtype=tf.int32, name='labels')
image_path_tf, label_tf = tf.train.slice_input_producer([image_paths_tf, labels_tf], shuffle=shuffle)

# preprocess single image paths
image_buffer_tf = tf.read_file(image_path_tf, name='image_buffer')
image_tf = tf.image.decode_jpeg(image_buffer_tf, channels=3, name='image')
image_tf = preprocess_image_tensor(image_tf)

# batch the results
image_batch_tf, labels_batch_tf = tf.train.batch([image_tf, label_tf], batch_size=batch_size, num_threads=num_threads)
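
The idea of binding paths and labels together before shuffling and batching can also be sketched without TensorFlow. The following is an illustration in plain Python, not the TensorFlow implementation; paired_batches and its parameters are hypothetical names chosen for this example.

```python
import random


def paired_batches(paths, labels, batch_size, shuffle=True, seed=None):
    """Yield (paths_batch, labels_batch); element i of each list belongs together."""
    if len(paths) != len(labels):
        raise ValueError("paths and labels must have the same length")
    # Bind each path to its label first, so any reordering moves them as a unit.
    pairs = list(zip(paths, labels))
    if shuffle:
        random.Random(seed).shuffle(pairs)
    for start in range(0, len(pairs), batch_size):
        chunk = pairs[start:start + batch_size]
        batch_paths, batch_labels = zip(*chunk)
        yield list(batch_paths), list(batch_labels)


paths = ["image_%d.jpg" % i for i in range(9)]
labels = list(range(9))
batches = list(paired_batches(paths, labels, batch_size=5, seed=0))

for batch_paths, batch_labels in batches:
    print(list(zip(batch_paths, batch_labels)))
```

Unlike tf.train.batch, this sketch emits a shorter final batch instead of waiting for more examples; that is a simplification made for the illustration.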


Source: https://stackoverflow.com/questions/41864333/tensorflow-mixes-up-images-and-labels-when-making-batch
