Using feed_dict is more than 5x faster than using dataset API?

生来不讨喜 2020-12-24 04:06

I created a dataset in TFRecord format for testing. Every entry contains 200 columns, named C1 - C199, each holding a list of strings, plus a label

1 answer
  • 2020-12-24 04:41

    There is currently (as of TensorFlow 1.9) a performance issue when using tf.data to map and batch tensors that have a large number of features with a small amount of data in each. The issue has two causes:

    1. The dataset.map(parse_tfrecord, ...) transformation will execute O(batch_size * num_columns) small operations to create a batch. By contrast, feeding a tf.placeholder() to tf.parse_example() will execute O(1) operations to create the same batch.

    2. Batching many tf.SparseTensor objects using dataset.batch() is much slower than directly creating the same tf.SparseTensor as the output of tf.parse_example().

    Improvements to both of these issues are underway and should be available in a future version of TensorFlow. In the meantime, you can improve the performance of the tf.data-based pipeline by swapping the order of dataset.map() and dataset.batch(), and rewriting the dataset.map() function to work on a vector of strings, like the feeding-based version:

    import tensorflow as tf  # TensorFlow 1.x API

    # data_file, batch_size, num_epochs and columns (the list of feature
    # columns) are assumed to be defined as in the question.
    dataset = tf.data.TFRecordDataset(data_file)
    dataset = dataset.prefetch(buffer_size=batch_size*10)
    dataset = dataset.repeat(num_epochs)

    # Batch first to create a vector of strings as input to the map().
    dataset = dataset.batch(batch_size)

    def parse_tfrecord_batch(record_batch):
      # Parse the whole batch of serialized examples in a single call.
      features = tf.parse_example(
          record_batch,
          features=tf.feature_column.make_parse_example_spec(
              columns + [
                  tf.feature_column.numeric_column(
                      'label', dtype=tf.float32, default_value=0)]))
      labels = features.pop('label')
      return features, labels

    # NOTE: Parallelism might not be as useful, because the individual map function now does
    # more work per invocation, but you might want to experiment with this.
    dataset = dataset.map(parse_tfrecord_batch)

    # Add a prefetch at the end to pipeline execution.
    dataset = dataset.prefetch(1)

    features, labels = dataset.make_one_shot_iterator().get_next()
    # ...

    EDIT (2018/6/18): To answer your questions from the comments:

    1. Why is dataset.map(parse_tfrecord, ...) O(batch_size * num_columns), not O(batch_size)? If parsing requires enumeration of the columns, why doesn't parse_example take O(num_columns)?

    When you wrap TensorFlow code in a Dataset.map() (or another functional transformation), a constant number of extra operations is added per output to "return" values from the function and (in the case of tf.SparseTensor values) "convert" them to a standard format. When you pass the outputs of tf.parse_example() directly to the input of your model, these operations aren't added. Each one is very small, but executing so many of them can become a bottleneck. (Technically, the parsing itself does take O(batch_size * num_columns) time, but the constant factor for parsing is much smaller than the cost of executing an operation.)
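
    As a rough illustration of that argument, the sketch below counts only the extra per-output operations, assuming one such operation per output and ignoring the label column. The function names, the `ops_per_output` parameter, and the batch size of 1024 are assumptions made for the example, and the result is a back-of-the-envelope count rather than a measurement.

```python
def extra_ops_map_then_batch(batch_size, num_columns, ops_per_output=1):
    """map() runs once per record, so the per-output overhead is paid
    batch_size * num_columns times for every batch."""
    return batch_size * num_columns * ops_per_output


def extra_ops_batch_then_map(num_columns, ops_per_output=1):
    """map() runs once per batch, so the per-output overhead is paid
    only num_columns times for every batch."""
    return num_columns * ops_per_output


batch_size = 1024   # assumed value, not taken from the question
num_columns = 200   # number of feature columns in the question

slow = extra_ops_map_then_batch(batch_size, num_columns)
fast = extra_ops_batch_then_map(num_columns)
print(slow, fast, slow // fast)
```

    Under these assumptions, the gap between the two orderings grows linearly with the batch size, which is consistent with batching before parsing being the suggested workaround.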

    2. Why do you add a prefetch at the end of the pipeline?

    When you care about performance, adding a prefetch at the end is almost always worthwhile: it lets the pipeline prepare the next batch while the current one is being consumed, which should improve overall throughput. For more information about best practices, see the performance guide for tf.data.
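
    To show what "pipelining" means here, the following is a conceptual sketch in plain Python: a background thread fills a small bounded buffer while the consumer works on the previous item. It is not how tf.data implements prefetching, and the helper name `prefetch` is hypothetical.

```python
import queue
import threading


def prefetch(iterable, buffer_size=1):
    """Yield items from `iterable`, producing up to `buffer_size` items
    ahead of the consumer in a background thread."""
    buffer = queue.Queue(maxsize=buffer_size)
    done = object()  # sentinel marking the end of the input

    def producer():
        for item in iterable:
            buffer.put(item)  # blocks while the buffer is full
        buffer.put(done)

    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = buffer.get()  # blocks until the producer has an item ready
        if item is done:
            return
        yield item


# The consumer sees the same items in the same order; only the timing changes.
batches = list(prefetch(range(5), buffer_size=1))
print(batches)
```

    A buffer of one item, like dataset.prefetch(1) above, is enough to overlap producing the next batch with consuming the current one. This sketch does not forward errors raised by the producer.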
