I have a directory of images, and a separate file matching image filenames to labels. So the directory of images has files like \'train/001.jpg\' and the labeling file looks lik
Given that your data is not too large for you to supply the list of filenames as a python array, I'd suggest just doing the preprocessing in Python. Create two lists (same order) of the filenames and the labels, and insert those into either a randomshufflequeue or a queue, and dequeue from that. If you want the "loops infinitely" behavior of the string_input_producer, you could re-run the 'enqueue' at the start of every epoch.
A very toy example:
import tensorflow as tf
f = ["f1", "f2", "f3", "f4", "f5", "f6", "f7", "f8"]
l = ["l1", "l2", "l3", "l4", "l5", "l6", "l7", "l8"]
fv = tf.constant(f)
lv = tf.constant(l)
rsq = tf.RandomShuffleQueue(10, 0, [tf.string, tf.string], shapes=[[],[]])
do_enqueues = rsq.enqueue_many([fv, lv])
gotf, gotl = rsq.dequeue()
with tf.Session() as sess:
sess.run(tf.initialize_all_variables())
tf.train.start_queue_runners(sess=sess)
sess.run(do_enqueues)
for i in xrange(2):
one_f, one_l = sess.run([gotf, gotl])
print "F: ", one_f, "L: ", one_l
The key is that you're effectively enqueueing pairs of filenames/labels when you do the enqueue
, and those pairs are returned by the dequeue
.