How do I create padded batches in TensorFlow for tf.train.SequenceExample data using the Dataset API?

余生分开走 2020-12-31 02:53

For training an LSTM model in Tensorflow, I have structured my data into a tf.train.SequenceExample format and stored it i

4 Answers
  • 2020-12-31 03:23

    You need to pass a tuple of shapes. In your case you should pass

    dataset = dataset.padded_batch(4, padded_shapes=([vectorSize],[None]))
    

    or try

    dataset = dataset.padded_batch(4, padded_shapes=([None],[None]))
    

    Check this code for more details. I had to debug this method to figure out why it wasn't working for me.
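
As a rough illustration of the tuple-of-shapes advice above, here is a minimal sketch that assumes TensorFlow 2.4 or later with eager execution. The toy generator, `VECTOR_SIZE` (standing in for `vectorSize`), and the sample values are made up for the example and are not taken from the original question.

```python
import tensorflow as tf

VECTOR_SIZE = 3  # hypothetical stand-in for vectorSize in the answer


def toy_examples():
    # Each element: (fixed-length context vector, variable-length sequence).
    yield [1.0, 2.0, 3.0], [7, 8]
    yield [4.0, 5.0, 6.0], [9]
    yield [0.0, 0.0, 0.0], [1, 2, 3, 4]
    yield [1.0, 1.0, 1.0], [5, 6, 7]


dataset = tf.data.Dataset.from_generator(
    toy_examples,
    output_signature=(
        tf.TensorSpec(shape=(VECTOR_SIZE,), dtype=tf.float32),
        tf.TensorSpec(shape=(None,), dtype=tf.int64),
    ),
)

# One shape per dataset component, passed together as a tuple.
# [None] means "pad this dimension to the longest element in the batch".
dataset = dataset.padded_batch(4, padded_shapes=([VECTOR_SIZE], [None]))

context, sequence = next(iter(dataset))
print(context.shape)   # batch of 4 fixed-length vectors
print(sequence.shape)  # batch of 4 sequences, padded with zeros to length 4
```

The shorter sequences are padded with zeros by default, so the second element `[9]` comes out as `[9, 0, 0, 0]`.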

  • 2020-12-31 03:23

    If your current Dataset object contains a tuple, you can also specify the padded shape of each element in the tuple.

    For example, I have a (same_sized_images, Labels) dataset and each label has different length but same rank.

    def process_label(resized_img, label):
        # Perform some tensor transformations
        # ......
    
        return resized_img, label
    
    dataset = dataset.map(process_label)
    dataset = dataset.padded_batch(batch_size, 
                                   padded_shapes=([None, None, 3], 
                                                  [None, None]))  # my label has rank 2
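
To make the meaning of the `None` entries more concrete, the sketch below mimics in plain Python how an unknown dimension is resolved for each batch: it becomes the largest size of that dimension among the batch's elements. `resolve_padded_shape` is a hypothetical helper written for this illustration, not a TensorFlow function, and the label shapes are invented.

```python
def resolve_padded_shape(spec, element_shapes):
    """Hypothetical helper: resolve None entries the way padded batching does.

    spec           -- padded shape for one component, e.g. [None, None]
    element_shapes -- shapes of that component for every element in the batch
    """
    resolved = []
    for axis, dim in enumerate(spec):
        if dim is None:
            # Unknown dimension: pad to the largest size seen in this batch.
            resolved.append(max(shape[axis] for shape in element_shapes))
        else:
            # Fixed dimension: used as given.
            resolved.append(dim)
    return resolved


# Three rank-2 labels with different sizes (made-up values).
label_shapes = [(2, 5), (4, 3), (1, 7)]
batch_label_shape = resolve_padded_shape([None, None], label_shapes)

# Same-sized images: the None entries simply resolve to the common size.
image_shapes = [(32, 32, 3), (32, 32, 3), (32, 32, 3)]
batch_image_shape = resolve_padded_shape([None, None, 3], image_shapes)

print(batch_label_shape)
print(batch_image_shape)
```

Because the resolution happens per batch, two batches from the same dataset can end up with different padded sizes.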
    
  • 2020-12-31 03:34

    Be careful not to pass a tuple of tuples. Doing so produces a very vague error: "cannot convert value None to type Nonetype".

    So correct: padded_shapes = ([None, None], [None])

    INCORRECT: padded_shapes = ((None, None), (None))
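
One detail that may contribute to the confusing message (an assumption, not something the answer states) is a plain Python rule: parentheses around a single value do not create a tuple, so `(None)` is just `None`. The short check below uses only the standard library; the variable names are illustrative.

```python
incorrect = ((None, None), (None))
correct = ([None, None], [None])

# (None) is not a one-element tuple; it is simply None.
second_incorrect = incorrect[1]
is_plain_none = second_incorrect is None

# [None] is a real one-element list describing a rank-1 shape.
second_correct = correct[1]
is_rank_one_shape = isinstance(second_correct, list) and len(second_correct) == 1

# A one-element tuple needs a trailing comma.
one_element_tuple = (None,)

print(is_plain_none)
print(is_rank_one_shape)
print(len(one_element_tuple))
```

So in the incorrect form the second component's shape is `None` itself rather than a shape with one unknown dimension.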

  • 2020-12-31 03:45

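    You can take the padded shapes directly from the dataset's output shapes; any unknown dimensions in them are padded to the longest element in each batch:

    padded_shapes = dataset.output_shapes

    In TensorFlow 2.x, where datasets no longer expose this attribute, tf.compat.v1.data.get_output_shapes(dataset) returns the same information.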
    