tf.data vs keras.utils.Sequence performance

北荒 2021-02-08 20:34

I'm trying to decide whether to use the existing keras.utils.Sequence module or to switch to tf.data. From what I understand, tf.data optimizes performance by overlapping train

1 answer
  • 2021-02-08 21:15

    Both approaches overlap input data preprocessing with model training. keras.utils.Sequence does this by running multiple Python processes (when use_multiprocessing=True; otherwise it uses threads), while tf.data does this by running multiple C++ threads.

    If your preprocessing is done by a non-TensorFlow Python library such as PIL, keras.utils.Sequence may work better for you, since multiple processes are needed to avoid contention on Python's global interpreter lock.
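
    As a rough illustration of that case, here is a minimal sketch of a Sequence whose preprocessing runs in plain Python/NumPy. The class name ArraySequence, the synthetic arrays, and the divide-by-255 step are all assumptions made for the example; NumPy stands in for a library such as PIL.

```python
import math

import numpy as np
import tensorflow as tf


class ArraySequence(tf.keras.utils.Sequence):
    """Hypothetical Sequence that serves batches from in-memory arrays."""

    def __init__(self, x, y, batch_size):
        super().__init__()
        self.x = x
        self.y = y
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return math.ceil(len(self.x) / self.batch_size)

    def __getitem__(self, idx):
        lo = idx * self.batch_size
        hi = lo + self.batch_size
        # Non-TensorFlow preprocessing: this is the work that would hold
        # the GIL and so benefits from separate worker processes.
        batch_x = self.x[lo:hi].astype("float32") / 255.0
        return batch_x, self.y[lo:hi]


# Synthetic stand-in data (assumption, not from the original post).
x = np.zeros((50, 8, 8, 1), dtype=np.uint8)
y = np.zeros((50,), dtype=np.int64)
seq = ArraySequence(x, y, batch_size=16)
```

    How the worker processes are enabled depends on the Keras version: in TF 2.x with Keras 2, workers and use_multiprocessing are arguments to model.fit, while newer Keras releases take them in the dataset constructor, so check the documentation for the version you have installed.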

    If you can express your preprocessing using TensorFlow operations, I would expect tf.data to give better performance.
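
    A minimal sketch of what "preprocessing expressed using TensorFlow operations" can look like in tf.data. The preprocess function, the synthetic tensors, and the chosen sizes are assumptions made for the example; tf.data.AUTOTUNE assumes TF 2.4 or later (older releases expose it as tf.data.experimental.AUTOTUNE).

```python
import tensorflow as tf


def preprocess(image, label):
    # Hypothetical preprocessing written only with TensorFlow ops,
    # so tf.data can run it in parallel C++ threads without the GIL.
    image = tf.cast(image, tf.float32) / 255.0
    image = tf.image.resize(image, (32, 32))
    return image, label


# Synthetic stand-in data (assumption, not from the original post).
images = tf.zeros([64, 28, 28, 1], dtype=tf.uint8)
labels = tf.zeros([64], dtype=tf.int32)

dataset = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)  # parallel preprocessing
    .batch(16)
    .prefetch(tf.data.AUTOTUNE)  # overlap preprocessing with training
)
```

    The resulting dataset can be passed directly to model.fit(dataset, ...).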

    Some other things to consider:

    • tf.data is the recommended approach for building scalable input pipelines for tf.keras.
    • tf.data is used more widely than keras.utils.Sequence, so it may be easier to find help with getting good performance.