Question
It seems that tf.data.Dataset provides a more flexible and more sophisticated alternative to TF queues (subclasses of QueueBase). (E.g. a TF queue cannot really be reopened after it was closed; see here, here.) (There also seem to be some downsides to Dataset, such as that it runs (mostly) on the CPU.)
I liked the FIFOQueue. Is there some equivalent Dataset?
More specifically, I have one (or more) background threads which would get data from somewhere (it might not be TF related), and this data could be pushed to some queue (maybe with some optional TF processing in between). On the other side, I would like to have a Dataset whose iterator yields these queued elements.
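Something like this is what I have in mind (a minimal sketch, assuming a plain Python queue.Queue as the buffer; produce_data and the sentinel value are hypothetical placeholders, not existing TF API):

import queue
import threading
import tensorflow as tf

data_queue = queue.Queue(maxsize=100)
SENTINEL = object()  # hypothetical marker, pushed once to signal the end of the data

def producer():
    # produce_data is a hypothetical stand-in for the real (non-TF) source.
    for item in produce_data():
        data_queue.put(item)
    data_queue.put(SENTINEL)

threading.Thread(target=producer, daemon=True).start()

def queue_generator():
    while True:
        item = data_queue.get()
        if item is SENTINEL:
            return  # ending the generator makes the iterator raise OutOfRangeError
        yield item

dataset = tf.data.Dataset.from_generator(
    queue_generator, output_types=tf.int64, output_shapes=tf.TensorShape([]))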
This would be very similar to what you would get with FIFOQueue. However, in addition to that, you can handle the end of the dataset in a more flexible way, and you can also re-initialize the iterator. This specifically is important for my use case, which is why FIFOQueue is not a good option for me.
Actually, on the consumer side, my code would look similar to one of the official TF examples (via):
iterator = tf.data.Iterator.from_structure(tf.int64, tf.TensorShape([]))
train_initializer = iterator.make_initializer(train_dataset)
dev_initializer = iterator.make_initializer(dev_dataset)
prediction, loss = model_fn(iterator.get_next())

for epoch in range(num_epochs):
    train_dataset.custom_epoch_init(epoch)  # whatever ...
    sess.run(train_initializer)
    while True:  # one pass over the training data
        try:
            pred, loss_val = sess.run([prediction, loss])
        except tf.errors.OutOfRangeError:
            break
    sess.run(dev_initializer)
    while True:  # one pass over the dev data
        try:
            pred, loss_val = sess.run([prediction, loss])
        except tf.errors.OutOfRangeError:
            break
How would I implement train_dataset and dev_dataset?
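The best I can come up with so far is a small from_generator factory (a sketch, reusing the hypothetical SENTINEL marker from above; train_queue and dev_queue would be per-split queue.Queue instances filled by my background threads):

def make_queue_dataset(q):
    # Each (re-)initialization of the iterator invokes the generator
    # callable again, so the same dataset can be consumed once per epoch.
    def gen():
        while True:
            item = q.get()
            if item is SENTINEL:
                return
            yield item
    return tf.data.Dataset.from_generator(
        gen, output_types=tf.int64, output_shapes=tf.TensorShape([]))

train_dataset = make_queue_dataset(train_queue)  # hypothetical per-split queues
dev_dataset = make_queue_dataset(dev_queue)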
I guess I get some queue-like behavior via PrefetchDataset. But what would the enqueue logic look like? Maybe via _GeneratorDataset?
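As far as I can tell, the public Dataset.from_generator is built on _GeneratorDataset, so the enqueue logic could live in the generator as in the sketches above; the queue-like buffering would then just come from prefetching (buffer size picked arbitrarily here):

dataset = dataset.prefetch(10)  # keep up to 10 elements ready in a background buffer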
This seems to be a related feature request.
Source: https://stackoverflow.com/questions/61964754/is-there-a-queue-like-dataset