TF slice_input_producer not keeping tensors in sync

Submitted anonymously (unverified) on 2019-12-03 02:03:01

Question:

I'm reading images into my TF network, but I also need the associated labels along with them.

So I tried to follow this answer, but the labels that are output don't actually match the images that I'm getting in every batch.

The names of my images are in the format dir/3.jpg, so I just extract the label from the image file name.

truth_filenames_np = ...
truth_filenames_tf = tf.convert_to_tensor(truth_filenames_np)

# get the labels
labels = [f.rsplit("/", 1)[1] for f in truth_filenames_np]
labels_tf = tf.convert_to_tensor(labels)

# *** This line should make sure both input tensors are synced
# (from my limited understanding).
# My list is also already shuffled, so I set shuffle=False
truth_image_name, truth_label = tf.train.slice_input_producer(
    [truth_filenames_tf, labels_tf], shuffle=False)

truth_image_value = tf.read_file(truth_image_name)
truth_image = tf.image.decode_jpeg(truth_image_value)
truth_image.set_shape([IMAGE_DIM, IMAGE_DIM, 3])
truth_image = tf.cast(truth_image, tf.float32)
truth_image = truth_image / 255.0

# Another key step, where I batch them together
truth_images_batch, truth_label_batch = tf.train.batch(
    [truth_image, truth_label], batch_size=mb_size)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    for i in range(epochs):
        print "Epoch ", i
        X_truth_batch = truth_images_batch.eval()
        X_label_batch = truth_label_batch.eval()

        # Here I display all the images in this batch, and then I check
        # which file numbers they actually are.
        # BUT, the images that are displayed don't correspond with what
        # is printed by X_label_batch!
        print X_label_batch
        plot_batch(X_truth_batch)

    coord.request_stop()
    coord.join(threads)

Am I doing something wrong, or does the slice_input_producer not actually ensure that its input tensors are synced?

Aside:

I also noticed that when I get a batch from tf.train.batch, the elements in the batch are adjacent to each other in the original list I gave it, but the batch order isn't the original order. Example: if my data is ["dir/1.jpg", "dir/2.jpg", "dir/3.jpg", "dir/4.jpg", "dir/5.jpg", "dir/6.jpg"], then with batch_size=2 I may get the batch ["dir/3.jpg", "dir/4.jpg"], then the batch ["dir/1.jpg", "dir/2.jpg"], and then the last one. So this makes it hard to even just use a FIFO queue for the labels, since the order won't match the batch order.

Answer 1:

Here is a complete runnable example that reproduces the problem:

import tensorflow as tf

truth_filenames_np = ['dir/%d.jpg' % j for j in range(66)]
truth_filenames_tf = tf.convert_to_tensor(truth_filenames_np)
# get the labels
labels = [f.rsplit("/", 1)[1] for f in truth_filenames_np]
labels_tf = tf.convert_to_tensor(labels)

# My list is also already shuffled, so I set shuffle=False
truth_image_name, truth_label = tf.train.slice_input_producer(
    [truth_filenames_tf, labels_tf], shuffle=False)

# # Another key step, where I batch them together
# truth_images_batch, truth_label_batch = tf.train.batch(
#     [truth_image_name, truth_label], batch_size=11)

epochs = 7

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    for i in range(epochs):
        print("Epoch ", i)
        X_truth_batch = truth_image_name.eval()
        X_label_batch = truth_label.eval()
        # Here I display all the images in this batch, and then I check
        # which file numbers they actually are.
        # BUT, the images that are displayed don't correspond with what is
        # printed by X_label_batch!
        print(X_truth_batch)
        print(X_label_batch)
    coord.request_stop()
    coord.join(threads)

What this prints is:

Epoch  0
b'dir/0.jpg'
b'1.jpg'
Epoch  1
b'dir/2.jpg'
b'3.jpg'
Epoch  2
b'dir/4.jpg'
b'5.jpg'
Epoch  3
b'dir/6.jpg'
b'7.jpg'
Epoch  4
b'dir/8.jpg'
b'9.jpg'
Epoch  5
b'dir/10.jpg'
b'11.jpg'
Epoch  6
b'dir/12.jpg'
b'13.jpg'

So basically each eval call runs the operation another time! Adding the batching makes no difference to that: it just prints whole batches instead (the first 11 filenames, followed by the next 11 labels, and so on).
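To make the mechanics explicit, here is a minimal sketch (Tensor.eval() is shorthand for a run of the default session, so every call triggers a fresh dequeue from the producer's queue):

# Inside the `with tf.Session() as sess:` block from above:
a = truth_image_name.eval()  # run 1: dequeues pair 0, returns its filename
b = truth_label.eval()       # run 2: dequeues pair 1, returns its label!

# The two .eval() calls are just shorthand for two separate runs:
# a = sess.run(truth_image_name)
# b = sess.run(truth_label)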

The workaround I see is:

for i in range(epochs):
    print("Epoch ", i)
    pair = tf.convert_to_tensor([truth_image_name, truth_label]).eval()
    print(pair[0])
    print(pair[1])

which correctly prints:

Epoch  0
b'dir/0.jpg'
b'0.jpg'
Epoch  1
b'dir/1.jpg'
b'1.jpg'
# ...

but it does nothing about the violation of the principle of least surprise.

EDIT: yet another way of doing it:

import tensorflow as tf

truth_filenames_np = ['dir/%d.jpg' % j for j in range(66)]
truth_filenames_tf = tf.convert_to_tensor(truth_filenames_np)
labels = [f.rsplit("/", 1)[1] for f in truth_filenames_np]
labels_tf = tf.convert_to_tensor(labels)
truth_image_name, truth_label = tf.train.slice_input_producer(
    [truth_filenames_tf, labels_tf], shuffle=False)
epochs = 7
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    tf.train.start_queue_runners(sess=sess)
    for i in range(epochs):
        print("Epoch ", i)
        X_truth_batch, X_label_batch = sess.run(
            [truth_image_name, truth_label])
        print(X_truth_batch)
        print(X_label_batch)

That's a much better way, since tf.convert_to_tensor and co. only accept tensors of the same type/shape etc., whereas a single sess.run call can fetch any combination of tensors at once.
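Applied back to the original question, the same single-sess.run idea also fixes the batched pipeline. A sketch under that assumption (it reuses truth_images_batch and truth_label_batch as built in the question's code, and an already-open session):

# Sketch: fetch both batched tensors in one run so that each image row
# stays paired with its own label (assumes the decoding/batching ops
# from the question's code have been built, and `sess` is open).
tf.train.start_queue_runners(sess=sess)
for i in range(epochs):
    X_truth_batch, X_label_batch = sess.run(
        [truth_images_batch, truth_label_batch])
    # X_truth_batch[k] now corresponds to X_label_batch[k]
    print(X_label_batch)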

Note that I removed the coordinator for simplicity; this, however, results in a warning:

W c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\kernels\queue_base.cc:294] _0_input_producer/input_producer/fraction_of_32_full/fraction_of_32_full: Skipping cancelled enqueue attempt with queue not closed

See this
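To silence that warning, the coordinator from the question's code can simply be kept. A sketch of the usual TF 1.x shutdown pattern (again assuming an open session `sess`):

coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)
try:
    for i in range(epochs):
        X_truth_batch, X_label_batch = sess.run(
            [truth_image_name, truth_label])
finally:
    # Stopping the queue runners and joining their threads before the
    # session closes avoids the "Skipping cancelled enqueue" warning.
    coord.request_stop()
    coord.join(threads)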


