TensorFlow Data API - prefetch

攒了一身酷 2021-02-04 04:17

I am trying to use one of the new features of TF, namely the Data API, and I am not sure how prefetch works. In the code below

def dataset_input_fn(...)
    da         


        
1 answer
  • 2021-02-04 04:20

    In a discussion on GitHub I found a comment by mrry:

    Note that in TF 1.4 there will be a Dataset.prefetch() method that makes it easier to add prefetching at any point in the pipeline, not just after a map(). (You can try it by downloading the current nightly build.)

    and

    For example, Dataset.prefetch() will start a background thread to populate a ordered buffer that acts like a tf.FIFOQueue, so that downstream pipeline stages need not block. However, the prefetch() implementation is much simpler, because it doesn't need to support as many different concurrent operations as a tf.FIFOQueue.

    So prefetch can be placed after any transformation in the pipeline, and it buffers the output of the transformation that precedes it. So far I have seen the biggest performance gains when placing it only at the very end.
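As a rough illustration of putting prefetch at the very end of a pipeline, here is a minimal sketch. It assumes TensorFlow 2.x with eager execution (not the 1.4 nightly discussed above), and the toy range dataset, the `double` function and the buffer size of 1 are placeholders chosen for the example, not taken from the question.

```python
import tensorflow as tf


def double(x):
    # Stand-in for real preprocessing work (hypothetical example function).
    return x * 2


dataset = (
    tf.data.Dataset.range(10)  # toy source: 0, 1, ..., 9
    .map(double)               # per-element transformation
    .batch(4)                  # group elements into batches of up to 4
    .prefetch(1)               # last step: keep one ready batch buffered
)

# Iterating works as usual; prefetch changes when batches are produced,
# not what they contain.
batches = [batch.numpy().tolist() for batch in dataset]
print(batches)
```

Because prefetch here follows batch, the buffer holds whole batches, so a buffer size of 1 means one batch is prepared in the background while the previous one is being consumed.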

    There is one more discussion, "Meaning of buffer_size in Dataset.map, Dataset.prefetch and Dataset.shuffle", where mrry explains a bit more about prefetch and its buffer.
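To make the role of the buffer more concrete, the idea mrry describes (a background thread filling an ordered, bounded buffer) can be sketched in plain Python. This is only an analogy, not how TensorFlow implements it: `prefetch_sketch` is a hypothetical helper, and it neither propagates exceptions from the producer thread nor supports early cancellation.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the upstream iterable


def prefetch_sketch(iterable, buffer_size=1):
    """Yield items from `iterable` in order, produced ahead by a background thread."""
    buf = queue.Queue(maxsize=buffer_size)  # bounded FIFO buffer

    def producer():
        for item in iterable:
            buf.put(item)  # blocks while the buffer is full
        buf.put(_DONE)

    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = buf.get()  # blocks only while the buffer is empty
        if item is _DONE:
            return
        yield item


result = list(prefetch_sketch(range(5), buffer_size=2))
print(result)
```

In this sketch `buffer_size` bounds how far the producer may run ahead of the consumer: a larger value lets more items be prepared in advance at the cost of more memory.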

    UPDATE 2018/10/01:

    Since version 1.7.0 the Dataset API (in contrib) has offered a prefetch_to_device transformation. Note that it has to be the last transformation in the pipeline, and that contrib will be gone when TF 2.0 arrives. To make prefetching work on multiple GPUs, use MultiDeviceIterator from multi_device_iterator_ops.py (for an example, see #13610).

    https://www.tensorflow.org/versions/master/api_docs/python/tf/contrib/data/prefetch_to_device
