What is the impact of different image resizer dimensions when using the default config of the Object Detection API?

攒了一身酷 2021-02-04 11:02

I was trying to use the TensorFlow Object Detection API to train a model, using the sample config for Faster R-CNN ResNet-101 (https://github.com/tensorflow/models/blo

1 answer
  •  鱼传尺愫
    2021-02-04 11:52

    After some tests, I think I have found the answer. Please correct me if anything is wrong.

    In the .config file:

    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    

    This setting is handled in 'object_detection/builders/image_resizer_builder.py':

    if image_resizer_config.WhichOneof(
        'image_resizer_oneof') == 'keep_aspect_ratio_resizer':
      keep_aspect_ratio_config = image_resizer_config.keep_aspect_ratio_resizer
      if not (keep_aspect_ratio_config.min_dimension
              <= keep_aspect_ratio_config.max_dimension):
        raise ValueError('min_dimension > max_dimension')
      return functools.partial(
          preprocessor.resize_to_range,
          min_dimension=keep_aspect_ratio_config.min_dimension,
          max_dimension=keep_aspect_ratio_config.max_dimension)
    

    The builder returns the 'resize_to_range' function from 'object_detection/core/preprocessor.py', with 'min_dimension' and 'max_dimension' bound to the configured values:

      with tf.name_scope('ResizeToRange', values=[image, min_dimension]):
        image_shape = tf.shape(image)
        orig_height = tf.to_float(image_shape[0])
        orig_width = tf.to_float(image_shape[1])
        orig_min_dim = tf.minimum(orig_height, orig_width)
    
        # Calculates the larger of the possible sizes
        min_dimension = tf.constant(min_dimension, dtype=tf.float32)
        large_scale_factor = min_dimension / orig_min_dim
        # Scaling orig_(height|width) by large_scale_factor will make the smaller
        # dimension equal to min_dimension, save for floating point rounding errors.
        # For reasonably-sized images, taking the nearest integer will reliably
        # eliminate this error.
        large_height = tf.to_int32(tf.round(orig_height * large_scale_factor))
        large_width = tf.to_int32(tf.round(orig_width * large_scale_factor))
        large_size = tf.stack([large_height, large_width])
    
        if max_dimension:
          # Calculates the smaller of the possible sizes, use that if the larger
          # is too big.
          orig_max_dim = tf.maximum(orig_height, orig_width)
          max_dimension = tf.constant(max_dimension, dtype=tf.float32)
          small_scale_factor = max_dimension / orig_max_dim
          # Scaling orig_(height|width) by small_scale_factor will make the larger
          # dimension equal to max_dimension, save for floating point rounding
          # errors. For reasonably-sized images, taking the nearest integer will
          # reliably eliminate this error.
          small_height = tf.to_int32(tf.round(orig_height * small_scale_factor))
          small_width = tf.to_int32(tf.round(orig_width * small_scale_factor))
          small_size = tf.stack([small_height, small_width])
    
          new_size = tf.cond(
              tf.to_float(tf.reduce_max(large_size)) > max_dimension,
              lambda: small_size, lambda: large_size)
        else:
          new_size = large_size
    
        new_image = tf.image.resize_images(image, new_size,
                                           align_corners=align_corners)
    

    From the code above, an image of size 800*1000 is resized to 600*750: the shorter side (800) is scaled to min_dimension (600), a factor of 0.75, and the longer side becomes 750, which does not exceed max_dimension (1024).

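    That is, this image resizer always rescales the input image while keeping its aspect ratio. The shorter side is scaled to 'min_dimension', unless that would push the longer side past 'max_dimension'; in that case the longer side is scaled to 'max_dimension' instead. Images smaller than 'min_dimension' are scaled up, not left unchanged.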
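
To check the arithmetic without building a TensorFlow graph, here is a minimal plain-Python sketch of the size calculation in the quoted 'resize_to_range' code. The function name `resize_to_range_size` is hypothetical (it is not part of the Object Detection API), and it only computes the target size; it does not resize any image. The defaults 600 and 1024 are the values from the config above.

```python
def resize_to_range_size(height, width, min_dimension=600, max_dimension=1024):
    """Return the (height, width) the quoted resize_to_range logic would choose.

    Hypothetical helper for illustration only; mirrors the size computation,
    not the actual image resampling.
    """
    # Candidate 1: scale so that the shorter side equals min_dimension.
    large_scale = min_dimension / min(height, width)
    large_size = (round(height * large_scale), round(width * large_scale))

    if max_dimension:
        # Candidate 2: scale so that the longer side equals max_dimension.
        small_scale = max_dimension / max(height, width)
        small_size = (round(height * small_scale), round(width * small_scale))
        # Use the smaller candidate only if the larger one is too big.
        if max(large_size) > max_dimension:
            return small_size
    return large_size


if __name__ == "__main__":
    print(resize_to_range_size(800, 1000))   # the example from the answer
    print(resize_to_range_size(500, 2000))   # very wide image: max_dimension wins
    print(resize_to_range_size(300, 400))    # small image: scaled up
```

With these settings, an 800*1000 image gives (600, 750), a 500*2000 image gives (256, 1024) because the first candidate (600, 2400) exceeds 1024, and a 300*400 image is enlarged to (600, 800).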
