Question
I will train Faster R-CNN on my dataset for one class. All my images are 1920x1080. Should I resize or crop the images, or can I train at this size? Also, my objects are really small (around 60x60 pixels).
In the config file the dimensions are given as min_dimension: 600 and max_dimension: 1024, which is why I am confused about training the model on 1920x1080 images.
Answer 1:
If your objects are small, resizing the images to a smaller size is not a good idea. You can raise max_dimension to 1920 or 2000, which will make training and inference somewhat slower. Before cropping, consider how the objects are placed in the images: if cropping would cut through many objects, you will end up with many truncated instances, which can hurt the model's performance.
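Assuming the question refers to the TensorFlow Object Detection API's pipeline config (where min_dimension and max_dimension live inside a keep_aspect_ratio_resizer), the suggested change might look like the fragment below. The exact values are an illustration for 1920x1080 input, not taken from the original answer:

```
image_resizer {
  keep_aspect_ratio_resizer {
    # Keep 1920x1080 images at full resolution so the ~60x60
    # objects are not shrunk further by the resizer.
    min_dimension: 1080
    max_dimension: 1920
  }
}
```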
Answer 2:
If you insist on using Faster R-CNN for this task, I personally recommend:
Change the input height and width (the minimum and maximum dimensions) in the config file so that training runs successfully on your dataset at full resolution.
Change the default region-proposal (anchor) parameters (also in the config file) to a ratio and scale that match your objects, e.g. an aspect ratio of 1:1 and a scale of about 60 pixels.
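In the same config format, the anchor suggestion could be sketched as below. The grid anchor generator's default base anchor is 256x256, so a scale of 0.25 gives ~64-px anchors, roughly matching the 60x60 objects. These specific values are an assumption for illustration, not part of the original answer:

```
first_stage_anchor_generator {
  grid_anchor_generator {
    # 256 * 0.25 = 64 px, close to the ~60x60 objects;
    # a second scale is kept as a safety margin.
    scales: [0.25, 0.5]
    # Square objects -> a single 1:1 aspect ratio.
    aspect_ratios: [1.0]
    height_stride: 16
    width_stride: 16
  }
}
```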
But if I were you, I would also try:
Add some shortcut connections in the backbone, since small-object detection needs high-resolution features.
Cut off the Fast R-CNN head to improve performance: with only one class to detect, the problem reduces to object vs. background, and the output of the RPN stage should be enough to encode that information.
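To see why high-resolution features matter for this task, a quick back-of-the-envelope check helps. The strides below are typical for a ResNet-style backbone (an assumption, not from the question): at stride 32, a 60x60 object occupies less than two feature-map cells per side, leaving almost no spatial detail for the RPN.

```python
# Rough footprint of a 60x60 object on feature maps at common
# backbone strides (hypothetical ResNet-style values).
object_px = 60

for stride in (4, 8, 16, 32):
    cells = object_px / stride
    print(f"stride {stride:2d}: object spans ~{cells:.1f} feature cells per side")
```

At stride 16 (a common RPN feature stride) the object covers only about 3.8 cells per side, which is why the answer suggests feeding the RPN higher-resolution features via shortcut connections.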
Source: https://stackoverflow.com/questions/57855824/training-image-size-faster-rcnn