Question
In distributed TensorFlow, I need to process input data on one worker and consume it in a different session. "make_initializable_iterator" has an undocumented parameter "shared_name", but how can I initialize the iterator without creating the dataset in every session?
def make_initializable_iterator(self, shared_name=None):
"""Creates an `Iterator` for enumerating the elements of this dataset.
Note: The returned iterator will be in an uninitialized state,
and you must run the `iterator.initializer` operation before using it"""
To be clear: if I define an iterator with shared_name, how do I use that iterator in another session?
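For illustration, a minimal sketch of the situation (TF 1.x; the dataset and names are placeholders, not from the original post): the dataset and named iterator are built and initialized in one session, and the open question is how a second session should pick up that same iterator.
import tensorflow as tf

# Producer side: builds the dataset and the named iterator, then initializes it.
dataset = tf.data.Dataset.range(100).map(lambda x: x * 2)
it = dataset.make_initializable_iterator(shared_name='shared_iterator')

with tf.Session() as sess:  # in practice: tf.Session(target=worker_target)
    sess.run(it.initializer)

# Open question: how can a different session call it.get_next() without
# rebuilding `dataset` and re-running the initializer?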
Answer 1:
The iter_init_op might be what you are searching for:
# this is how an input pipeline usually looks
# (file_list, augmentation_function and batch_size are assumed to be defined)
import multiprocessing
import tensorflow as tf

ncores = multiprocessing.cpu_count()
dataset = tf.data.Dataset.from_tensor_slices(file_list)
dataset = dataset.map(augmentation_function, num_parallel_calls=ncores)
batch = dataset.shuffle(batch_size).batch(batch_size).prefetch(5)

# construct the iterator
it = batch.make_initializable_iterator(shared_name='shared_iterator')
iter_init_op = it.initializer  # run this op inside a session to (re)initialize the iterator
Within the session:
with tf.Session() as sess:
    ...
    for epoch in range(nb_epoch):
        # (re)initialize the iterator at the start of each epoch
        sess.run(iter_init_op)
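To actually consume elements, call get_next() once while building the graph and run the resulting tensor inside the loop. The sketch below (my own, not part of the answer) also illustrates the cross-session use the question asks about, assuming TF 1.x and that both sessions connect to the same server target, so the iterator state registered under shared_name is shared; the in-process server stands in for a real cluster worker.
import tensorflow as tf

# Placeholder in-process server; in a real cluster this would be a worker target.
server = tf.train.Server.create_local_server()

dataset = tf.data.Dataset.range(10).batch(4)
it = dataset.make_initializable_iterator(shared_name='shared_iterator')
next_batch = it.get_next()  # build once, run inside the loop

# Session A initializes the shared iterator ...
sess_a = tf.Session(target=server.target)
sess_a.run(it.initializer)

# ... and Session B, connected to the same target, consumes it without
# re-running the initializer: the iterator resource is looked up on the
# worker by its shared_name.
sess_b = tf.Session(target=server.target)
while True:
    try:
        print(sess_b.run(next_batch))
    except tf.errors.OutOfRangeError:
        break

sess_a.close()
sess_b.close()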
Source: https://stackoverflow.com/questions/55612210/how-to-use-shared-name-on-initializable-iterator