问题
The current TensorFlow dataset interleave functionality is basically a interleaved flat-map taking as input a single dataset. Given the current API, what's the best way to interleave multiple datasets together? Say they have already been constructed and I have a list of them. I want to produce elements from them alternatively and I want to support lists with more than 2 datasets (i.e., stacked zips and interleaves would be pretty ugly).
Thanks! :)
@mrry might be able to help.
回答1:
EDIT 2: See tf.contrib.data.choose_from_datasets
. It performs deterministic dataset interleaving.
EDIT: See tf.contrib.data.sample_from_datasets
. Even though it performs random sampling I guess it can be useful.
Even though this is not "clean", it is the only workaround I came up with.
datasets = [tf.data.Dataset...]
def concat_datasets(datasets):
ds0 = tf.data.Dataset.from_tensors(datasets[0])
for ds1 in datasets[1:]:
ds0 = ds0.concatenate(tf.data.Dataset.from_tensors(ds1))
return ds0
ds = tf.data.Dataset.zip(tuple(datasets)).flat_map(
lambda *args: concat_datasets(args)
)
回答2:
Expanding user2781994 answer (with edits), here is how I implemented it:
import tensorflow as tf
ds11 = tf.data.Dataset.from_tensor_slices([1,2,3])
ds12 = tf.data.Dataset.from_tensor_slices([4,5,6])
ds13 = tf.data.Dataset.from_tensor_slices([7,8,9])
all_choices_ds = [ds11, ds12, ds13]
choice_dataset = tf.data.Dataset.range(len(all_choices_ds)).repeat()
ds14 = tf.contrib.data.choose_from_datasets(all_choices_ds, choice_dataset)
# alternatively:
# ds14 = tf.contrib.data.sample_from_datasets(all_choices_ds)
iterator = ds14.make_initializable_iterator()
next_element = iterator.get_next()
with tf.Session() as sess:
sess.run(iterator.initializer)
while True:
try:
value=sess.run(next_element)
except tf.errors.OutOfRangeError:
break
print(value)
The output is:
1
4
7
2
5
8
3
6
9
来源:https://stackoverflow.com/questions/49058913/interleaving-multiple-tensorflow-datasets-together