Is a predefined image size required to use transfer learning in TensorFlow?

浪子不回头ぞ submitted on 2019-12-06 09:26:01

No, you do not need to resize your input images to a fixed shape yourself. The TensorFlow Object Detection API has a preprocessing step that resizes all input images. The function below is defined in that preprocessing step; its image_resizer_fn argument corresponds to the image_resizer field in the config file.

def transform_input_data(tensor_dict,
                         model_preprocess_fn,
                         image_resizer_fn,
                         num_classes,
                         data_augmentation_fn=None,
                         merge_multiple_boxes=False,
                         retain_original_image=False,
                         use_multiclass_scores=False,
                         use_bfloat16=False):
  """A single function that is responsible for all input data transformations.

  Data transformation functions are applied in the following order.
  1. If key fields.InputDataFields.image_additional_channels is present in
     tensor_dict, the additional channels will be merged into
     fields.InputDataFields.image.
  2. data_augmentation_fn (optional): applied on tensor_dict.
  3. model_preprocess_fn: applied only on image tensor in tensor_dict.
  4. image_resizer_fn: applied on original image and instance mask tensor in
     tensor_dict.
  5. one_hot_encoding: applied to classes tensor in tensor_dict.
  6. merge_multiple_boxes (optional): when groundtruth boxes are exactly the
     same they can be merged into a single box with an associated k-hot class
     label.
  ...
  """

According to the proto file, you can choose among four image resizers:

  1. keep_aspect_ratio_resizer
  2. fixed_shape_resizer
  3. identity_resizer
  4. conditional_shape_resizer

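Here is an excerpt from a sample config file for the model faster_rcnn_resnet101_pets. Every image is resized by keep_aspect_ratio_resizer with min_dimension=600 and max_dimension=1024: the shorter side is scaled to 600 pixels unless that would push the longer side past 1024, in which case the longer side is scaled to 1024 instead.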

model {
  faster_rcnn {
    num_classes: 37
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_resnet101'
      first_stage_features_stride: 16
    }
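To make the effect of these two settings concrete, here is a minimal sketch in plain Python of how an aspect-preserving resizer could pick its output size. The function name keep_aspect_ratio_size is hypothetical and this is not the API's own code; it only mirrors the documented behaviour of min_dimension and max_dimension, and rounding details in the real implementation may differ.

```python
def keep_aspect_ratio_size(height, width, min_dimension=600, max_dimension=1024):
    """Return the (height, width) an aspect-preserving resize would produce.

    Hypothetical helper for illustration; it only computes the target shape
    and does not touch any pixels.
    """
    # First try: scale so that the shorter side becomes min_dimension.
    scale = min_dimension / min(height, width)
    new_h, new_w = round(height * scale), round(width * scale)

    # If the longer side now exceeds max_dimension, scale by the longer
    # side instead so that it lands exactly on max_dimension.
    if max(new_h, new_w) > max_dimension:
        scale = max_dimension / max(height, width)
        new_h, new_w = round(height * scale), round(width * scale)

    return new_h, new_w


print(keep_aspect_ratio_size(480, 640))   # (600, 800)
print(keep_aspect_ratio_size(500, 2000))  # (256, 1024)
```

The second call shows why the output is not a fixed shape: a very wide image ends up with a shorter side well below 600 because the max_dimension cap takes priority.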

In fact, the size of the resized images strongly affects the trade-off between detection speed and accuracy. Although there is no specific requirement on input image size, it is better for every image's smaller dimension to exceed a reasonable value, so that the convolutional layers, which repeatedly downsample the image, still produce usable feature maps.
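As a rough illustration of why very small inputs are a problem, the sketch below estimates the spatial size of the first-stage feature map for a few input sizes, using the first_stage_features_stride of 16 from the config above. The helper feature_map_size is hypothetical, and the ceiling division assumes "same"-style padding; the exact size produced by a real feature extractor may differ by a pixel or so.

```python
import math


def feature_map_size(height, width, stride=16):
    """Approximate (height, width) of a feature map after downsampling by `stride`.

    Hypothetical helper; assumes each input dimension is divided by the
    stride and rounded up.
    """
    return math.ceil(height / stride), math.ceil(width / stride)


for size in [(600, 1024), (300, 512), (32, 32)]:
    print(size, "->", feature_map_size(*size))
# (600, 1024) -> (38, 64)
# (300, 512) -> (19, 32)
# (32, 32) -> (2, 2)
```

Larger inputs give the detector more feature-map cells to work with, at the cost of more computation per image, while a 32x32 input leaves only a 2x2 grid.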
