How to create a dataset in the same format as the FSNS dataset?

广开言路 2020-11-27 16:39

I'm working on this project based on TensorFlow.

I just want to train an OCR model with attention_ocr on my own dataset, but I don't know how to store my imag

2 answers
  • 2020-11-27 17:23

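    Do not use the following line as-is:

    'image/encoded': _bytes_feature(img.tostring()),

    img.tostring() stores the raw pixel buffer of the array, not an encoded image. In my code, I encode the image first:

    _, jpegVector = cv2.imencode('.jpeg', img)
    imgStr = jpegVector.tobytes()  # tobytes() replaces the deprecated tostring()
    'image/encoded': _bytes_feature(imgStr)  # entry in the feature dict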
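A quick way to confirm that the bytes stored under 'image/encoded' really are an encoded image is to look at their leading magic bytes. The sketch below is only an illustration: `detect_image_format` is a hypothetical helper name, not part of attention_ocr or TensorFlow, and it recognizes only PNG and JPEG.

```python
# Leading bytes that identify the two formats (file signatures).
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SOI = b"\xff\xd8"  # JPEG "start of image" marker


def detect_image_format(data):
    """Return 'png' or 'jpeg' if `data` starts with a known signature, else None.

    Hypothetical helper for sanity-checking bytes before writing them to a
    record; raw pixel buffers generally start with neither signature.
    """
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_SOI):
        return "jpeg"
    return None
```

Calling it on the output of cv2.imencode('.jpeg', img) should give 'jpeg', while a raw pixel buffer will normally give None.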
    
  • 2020-11-27 17:31

    The format for storing training/test data is defined in the FSNS paper https://arxiv.org/pdf/1702.03970.pdf (Table 4).

    To store tfrecord files with tf.Example protos you can use tf.python_io.TFRecordWriter (tf.io.TFRecordWriter in TensorFlow 2). There is a nice tutorial, an existing answer on Stack Overflow, and a short gist.

    Assume you have a NumPy ndarray img that holds num_of_views images stored side by side (see Fig. 3 in the paper) and the corresponding text in a variable text. You will need to define a function that converts a unicode string into a list of character ids, returned both padded to a fixed length and unpadded. For example:

    char_ids_padded, char_ids_unpadded = encode_utf8_string(
       text='abc', 
       charset={'a':0, 'b':1, 'c':2},
       length=5,
       null_char_id=3)
    

    The result should be:

    char_ids_padded = [0,1,2,3,3]
    char_ids_unpadded = [0,1,2]
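The answer leaves encode_utf8_string to the reader. A minimal sketch that matches the call and the result shown above could look like this. The error handling is an assumption: the answer does not say what should happen when the text is longer than length or contains a character missing from charset.

```python
def encode_utf8_string(text, charset, length, null_char_id):
    """Convert `text` into character ids using the `charset` mapping.

    Returns a tuple (char_ids_padded, char_ids_unpadded), where the padded
    list is filled with `null_char_id` up to exactly `length` entries.
    """
    if len(text) > length:
        # Assumption: overlong texts are rejected instead of truncated.
        raise ValueError(
            "text has %d characters, maximum is %d" % (len(text), length))
    # A character missing from `charset` raises KeyError here.
    char_ids_unpadded = [charset[ch] for ch in text]
    padding = [null_char_id] * (length - len(char_ids_unpadded))
    return char_ids_unpadded + padding, char_ids_unpadded
```

With text='abc', charset={'a':0, 'b':1, 'c':2}, length=5 and null_char_id=3 this returns ([0, 1, 2, 3, 3], [0, 1, 2]).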
    

    If you use the functions _int64_feature and _bytes_feature defined in the gist, you can create an FSNS-compatible tf.Example proto using the following snippet:

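    import cv2
    import tensorflow as tf

    char_ids_padded, char_ids_unpadded = encode_utf8_string(
       text, charset, length, null_char_id)
    # Encode the pixels as PNG so the bytes match 'image/format'; the raw
    # output of img.tostring() is not a PNG stream. cv2 assumes BGR order.
    _, png_vector = cv2.imencode('.png', img)
    example = tf.train.Example(features=tf.train.Features(
      feature={
        'image/format': _bytes_feature(b"PNG"),
        'image/encoded': _bytes_feature(png_vector.tobytes()),
        'image/class': _int64_feature(char_ids_padded),
        'image/unpadded_class': _int64_feature(char_ids_unpadded),
        'height': _int64_feature(img.shape[0]),
        'width': _int64_feature(img.shape[1]),
        # Integer division: the width of a single view.
        'orig_width': _int64_feature(img.shape[1] // num_of_views),
        'image/text': _bytes_feature(text.encode('utf-8'))
      }
    ))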
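The 'orig_width' field in the snippet above is the width of a single view, so it has to be an integer. In Python 3 the `/` operator returns a float, which an int64 feature will not accept. A small sketch of the arithmetic, where `single_view_width` is a hypothetical helper name and the 600 and 4 used below are illustrative values only:

```python
def single_view_width(total_width, num_of_views):
    """Return the width of one view when views are stored side by side.

    Raises ValueError if the total width is not evenly divisible, which
    usually means the views were concatenated incorrectly.
    """
    if num_of_views <= 0:
        raise ValueError("num_of_views must be positive")
    width, remainder = divmod(total_width, num_of_views)
    if remainder:
        raise ValueError(
            "width %d is not divisible by %d views" % (total_width, num_of_views))
    return width
```

For example, an image 600 pixels wide that holds 4 views gives a single-view width of 150.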
    