Tensorflow TFRecord: Can't parse serialized example

Submitted by 吃可爱长大的小学妹 on 2020-02-20 03:25:19

Question


I am trying to follow this guide in order to serialize my input data into the TFRecord format but I keep hitting this error when trying to read it:

InvalidArgumentError: Key: my_key. Can't parse serialized Example.

I am not sure where I'm going wrong. Here is a minimal reproduction of the issue I cannot get past.

Serialise some sample data:

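import tensorflow as tf  # TensorFlow 1.x API, as used throughout the question

# Write ten identical Examples; each feature holds three values.
with tf.python_io.TFRecordWriter('train.tfrecords') as writer:
    for idx in range(10):
        example = tf.train.Example(
            features=tf.train.Features(
                feature={
                    'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[1, 2, 3])),
                    'test': tf.train.Feature(float_list=tf.train.FloatList(value=[0.1, 0.2, 0.3])),
                }
            )
        )
        writer.write(example.SerializeToString())
# The with block closes the writer on exit, so no explicit writer.close() is needed.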

Parsing function & deserialise:

def parse(tfrecord):
  features = {
      'label': tf.FixedLenFeature([], tf.int64, default_value=0),
      'test': tf.FixedLenFeature([], tf.float32, default_value=0.0),
  }
  return tf.parse_single_example(tfrecord, features)

dataset = tf.data.TFRecordDataset('train.tfrecords').map(parse)
getnext = dataset.make_one_shot_iterator().get_next()

When trying to run this:

with tf.Session() as sess:
  v = sess.run(getnext)
  print(v)

I trigger the above error message.

Is it possible to get past this error and deserialize my data?


Answer 1:


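tf.FixedLenFeature() reads fixed-size arrays, and the shape of the data must be declared beforehand. Each feature in the question holds three values, but the parse function declares a scalar shape ([]), so the parser rejects the record. Updating the parse function to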

def parse(tfrecord):
    # Declare shape [3] so it matches the three values written per feature.
    return tf.parse_single_example(tfrecord, features={
        'label': tf.FixedLenFeature([3], tf.int64, default_value=[0, 0, 0]),
        'test': tf.FixedLenFeature([3], tf.float32, default_value=[0.0, 0.0, 0.0]),
    })

Should do the job.
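
To see the shape rule in isolation, here is a minimal in-memory sketch. It assumes TensorFlow 2.x with eager execution (so it uses the tf.io names rather than the 1.x names in the question) and skips the file round trip; the variable names are illustrative only.

```python
import tensorflow as tf  # assumption: TensorFlow 2.x with eager execution

# Build one Example in memory with the same two 3-element features as the question.
example = tf.train.Example(
    features=tf.train.Features(
        feature={
            'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[1, 2, 3])),
            'test': tf.train.Feature(float_list=tf.train.FloatList(value=[0.1, 0.2, 0.3])),
        }
    )
)
serialized = example.SerializeToString()

# The declared shape [3] matches the three values stored under each key.
matching_spec = {
    'label': tf.io.FixedLenFeature([3], tf.int64),
    'test': tf.io.FixedLenFeature([3], tf.float32),
}
parsed = tf.io.parse_single_example(serialized, matching_spec)
label_values = parsed['label'].numpy().tolist()

# A scalar shape [] expects exactly one value per key, so parsing is rejected.
scalar_spec = {'label': tf.io.FixedLenFeature([], tf.int64)}
try:
    tf.io.parse_single_example(serialized, scalar_spec)
    scalar_parse_failed = False
except tf.errors.InvalidArgumentError:
    scalar_parse_failed = True
```

With the matching spec the values come back as 3-element tensors; with the scalar spec the parser raises InvalidArgumentError, which is the failure reported in the question.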




Answer 2:


As an alternative, if your input features are not of fixed length, you can use tf.io.FixedLenSequenceFeature() with allow_missing=True and default_value=0 (0 for int types, 0.0 for float types). Unlike tf.io.FixedLenFeature(), it does not require the input feature to have a fixed size. You can find more information here.
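
A short sketch of that alternative, again assuming TensorFlow 2.x with eager execution. The helper make_example is hypothetical and exists only to produce records of different lengths for the demonstration.

```python
import tensorflow as tf  # assumption: TensorFlow 2.x with eager execution


def make_example(labels):
    """Hypothetical helper: serialize an Example whose 'label' holds any number of int64 values."""
    feature = {'label': tf.train.Feature(int64_list=tf.train.Int64List(value=labels))}
    return tf.train.Example(features=tf.train.Features(feature=feature)).SerializeToString()


# Shape [] describes a single element; the sequence length is left open.
spec = {
    'label': tf.io.FixedLenSequenceFeature(
        [], tf.int64, allow_missing=True, default_value=0),
}

# The same spec parses records holding different numbers of values.
short_labels = tf.io.parse_single_example(make_example([1, 2, 3]), spec)['label']
long_labels = tf.io.parse_single_example(make_example([4, 5, 6, 7, 8]), spec)['label']
```

Each parsed tensor is one-dimensional with as many entries as the record contained, so batching such a dataset would typically need padding (for example via padded_batch).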



Source: https://stackoverflow.com/questions/53499409/tensorflow-tfrecord-cant-parse-serialized-example
