I'm reading images into my TF network, and I also need their associated labels.
So I tried to follow this answer, but the labels that come out don't actually match the images I'm getting in each batch.
The names of my images are in the format dir/3.jpg, so I just extract the label from the image file name.
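(For example, "dir/3.jpg".rsplit("/", 1)[1] gives "3.jpg", and that string is what I use as the label.)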
truth_filenames_np = ...
truth_filenames_tf = tf.convert_to_tensor(truth_filenames_np)

# get the labels
labels = [f.rsplit("/", 1)[1] for f in truth_filenames_np]
labels_tf = tf.convert_to_tensor(labels)

# *** This line should make sure both input tensors are synced (from my limited understanding)
# My list is also already shuffled, so I set shuffle=False
truth_image_name, truth_label = tf.train.slice_input_producer(
    [truth_filenames_tf, labels_tf], shuffle=False)

truth_image_value = tf.read_file(truth_image_name)
truth_image = tf.image.decode_jpeg(truth_image_value)
truth_image.set_shape([IMAGE_DIM, IMAGE_DIM, 3])
truth_image = tf.cast(truth_image, tf.float32)
truth_image = truth_image / 255.0

# Another key step, where I batch them together
truth_images_batch, truth_label_batch = tf.train.batch(
    [truth_image, truth_label], batch_size=mb_size)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    for i in range(epochs):
        print "Epoch ", i
        X_truth_batch = truth_images_batch.eval()
        X_label_batch = truth_label_batch.eval()
        # Here I display all the images in this batch, and then I check
        # which file numbers they actually are.
        # BUT, the images that are displayed don't correspond with what is
        # printed by X_label_batch!
        print X_label_batch
        plot_batch(X_truth_batch)
    coord.request_stop()
    coord.join(threads)
Am I doing something wrong, or does the slice_input_producer not actually ensure that its input tensors are synced?
Aside:
I also noticed that when I get a batch from tf.train.batch, the elements in the batch are adjacent to each other in the original list I gave it, but the batches themselves don't come out in the original order. Example: If my data is ["dir/1.jpg", "dir/2.jpg", "dir/3.jpg", "dir/4.jpg", "dir/5.jpg", "dir/6.jpg"], then I may get the batch (with batch_size=2) ["dir/3.jpg", "dir/4.jpg"], then the batch ["dir/1.jpg", "dir/2.jpg"], and then the last one. So this makes it hard to even just use a FIFO queue for the labels, since the order won't match the batch order.
Here is a complete runnable example that reproduces the problem:
import tensorflow as tf

truth_filenames_np = ['dir/%d.jpg' % j for j in range(66)]
truth_filenames_tf = tf.convert_to_tensor(truth_filenames_np)

# get the labels
labels = [f.rsplit("/", 1)[1] for f in truth_filenames_np]
labels_tf = tf.convert_to_tensor(labels)

# My list is also already shuffled, so I set shuffle=False
truth_image_name, truth_label = tf.train.slice_input_producer(
    [truth_filenames_tf, labels_tf], shuffle=False)

# # Another key step, where I batch them together
# truth_images_batch, truth_label_batch = tf.train.batch(
#     [truth_image_name, truth_label], batch_size=11)

epochs = 7

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    for i in range(epochs):
        print("Epoch ", i)
        X_truth_batch = truth_image_name.eval()
        X_label_batch = truth_label.eval()
        # Here I display all the images in this batch, and then I check
        # which file numbers they actually are.
        # BUT, the images that are displayed don't correspond with what is
        # printed by X_label_batch!
        print(X_truth_batch)
        print(X_label_batch)
    coord.request_stop()
    coord.join(threads)
What this prints is:
Epoch 0
b'dir/0.jpg'
b'1.jpg'
Epoch 1
b'dir/2.jpg'
b'3.jpg'
Epoch 2
b'dir/4.jpg'
b'5.jpg'
Epoch 3
b'dir/6.jpg'
b'7.jpg'
Epoch 4
b'dir/8.jpg'
b'9.jpg'
Epoch 5
b'dir/10.jpg'
b'11.jpg'
Epoch 6
b'dir/12.jpg'
b'13.jpg'
So basically each eval call runs the operation another time! Adding the batching does not make a difference to that; it just prints batches instead (the first 11 filenames, followed by the next 11 labels, and so on).
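As far as I can tell, each .eval() is its own session run, and every run dequeues a fresh element from the producer's queue, i.e. roughly:

X_truth_batch = truth_image_name.eval()  # run 1: dequeues pair k, keeps only the filename
X_label_batch = truth_label.eval()       # run 2: dequeues pair k+1, keeps only the label -> mismatch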
The workaround I see is:
for i in range(epochs):
    print("Epoch ", i)
    pair = tf.convert_to_tensor([truth_image_name, truth_label]).eval()
    print(pair[0])
    print(pair[1])
which correctly prints:
Epoch 0
b'dir/0.jpg'
b'0.jpg'
Epoch 1
b'dir/1.jpg'
b'1.jpg'
# ...
but it does nothing for the violation of the principle of least surprise.
EDIT: yet another way of doing it:
import tensorflow as tf

truth_filenames_np = ['dir/%d.jpg' % j for j in range(66)]
truth_filenames_tf = tf.convert_to_tensor(truth_filenames_np)
labels = [f.rsplit("/", 1)[1] for f in truth_filenames_np]
labels_tf = tf.convert_to_tensor(labels)
truth_image_name, truth_label = tf.train.slice_input_producer(
    [truth_filenames_tf, labels_tf], shuffle=False)

epochs = 7

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    tf.train.start_queue_runners(sess=sess)
    for i in range(epochs):
        print("Epoch ", i)
        X_truth_batch, X_label_batch = sess.run(
            [truth_image_name, truth_label])
        print(X_truth_batch)
        print(X_label_batch)
That's a much better way, since tf.convert_to_tensor and co. only accept tensors of the same type/shape etc., whereas sess.run can fetch arbitrary tensors together.
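For the original question, the same single-sess.run fetch should also work with the batching re-enabled; here is a sketch (with the filename standing in for the decoded image, so it stays runnable without actual files):

import tensorflow as tf

truth_filenames_np = ['dir/%d.jpg' % j for j in range(66)]
truth_filenames_tf = tf.convert_to_tensor(truth_filenames_np)
labels = [f.rsplit("/", 1)[1] for f in truth_filenames_np]
labels_tf = tf.convert_to_tensor(labels)
truth_image_name, truth_label = tf.train.slice_input_producer(
    [truth_filenames_tf, labels_tf], shuffle=False)

# Re-enable the batching; the filename stands in for the decoded image
truth_images_batch, truth_label_batch = tf.train.batch(
    [truth_image_name, truth_label], batch_size=11)

epochs = 7

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    tf.train.start_queue_runners(sess=sess)
    for i in range(epochs):
        print("Epoch ", i)
        # A single run fetches both tensors from the same dequeue,
        # so each filename batch stays aligned with its label batch
        X_truth_batch, X_label_batch = sess.run(
            [truth_images_batch, truth_label_batch])
        print(X_truth_batch)
        print(X_label_batch)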
Note that I removed the coordinator for simplicity; that, however, results in a warning:
W c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\kernels\queue_base.cc:294] _0_input_producer/input_producer/fraction_of_32_full/fraction_of_32_full: Skipping cancelled enqueue attempt with queue not closed
See this
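To get rid of the warning, keep the coordinator and shut the queue runners down explicitly before the session closes, along these lines:

coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)
# ... the fetch loop from above ...
coord.request_stop()  # ask the queue runner threads to stop enqueuing
coord.join(threads)   # wait for them to finish before the session is closed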