Updating TensorFlow Object Detection model with new images

Submitted by 大城市里の小女人 on 2019-12-21 11:57:38

Question


I have trained a Faster R-CNN model on a custom dataset using TensorFlow's Object Detection API. Over time I would like to keep updating the model with additional images (collected weekly). The goal is to optimize for accuracy and to weight newer images more heavily over time.

Here are a few alternatives:

  1. Add images to previous dataset and train a completely new model
  2. Add images to previous dataset and continue training previous model
  3. New dataset with just new images and continue training previous model

Here are my thoughts:

Option 1: would be more time-consuming, but all images would be treated "equally".

Option 2: would likely take less additional training time, but one concern is that the algorithm might weight the earlier images more.

Option 3: this seems like the best option. Take the original model and simply focus on training on the new images.

Is one of these clearly better? What would be the pros/cons of each?

In addition, I'd like to know whether it's better to keep one test set as a control for accuracy or to create a new one each time that includes newer images. Perhaps add one portion of the new images to the training set and another to the test set, and then feed older test-set images back into training (or throw them out)?
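To make the test-set idea concrete, here is a minimal sketch of one possible scheme: keep the original test set frozen as a control, and route a fixed fraction of each week's new images to a second, growing test set. It assumes every image has a stable string ID. The function names and the 20% fraction are illustrative assumptions, not part of the Object Detection API.

```python
import hashlib


def assign_split(image_id, test_fraction=0.2):
    """Deterministically assign an image to 'train' or 'test' from its ID.

    Hashing the ID (instead of drawing a random number) means an image
    always lands in the same split, no matter which week the script runs.
    """
    digest = hashlib.sha256(image_id.encode("utf-8")).hexdigest()
    # Map the first 8 hex digits to a number in [0, 1).
    bucket = int(digest[:8], 16) / 16 ** 8
    return "test" if bucket < test_fraction else "train"


def split_batch(image_ids, test_fraction=0.2):
    """Split one weekly batch of image IDs into train and test lists."""
    splits = {"train": [], "test": []}
    for image_id in image_ids:
        splits[assign_split(image_id, test_fraction)].append(image_id)
    return splits
```

Reporting accuracy on both the frozen control set and the growing set separates "the model got worse" from "the data changed".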


Answer 1:


Consider the case where your dataset is nearly perfect. If you ran the model on new images (collected weekly), then the results (i.e. boxes with scores) would be exactly what you want from the model and it would be pointless adding these to the dataset because the model would not be learning anything new.

For the imperfect dataset, results from new images will show (some) errors and these are appropriate for further training. But there may be "bad" images already in the dataset and it is desirable to remove these. This indicates that Option 1 must occur, on some schedule, to remove entirely the effect of "bad" images.
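A rough way to find the new images whose results "show errors" is to compare the model's boxes with the annotations and keep the images where an annotated object was missed. The sketch below assumes the new images are already annotated and that boxes are `(ymin, xmin, ymax, xmax)` tuples. The function names and the 0.5 IoU threshold are illustrative choices, and it ignores class labels and false positives for brevity.

```python
def iou(box_a, box_b):
    """Intersection over union of two (ymin, xmin, ymax, xmax) boxes."""
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])
    intersection = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def has_missed_object(ground_truth, predictions, iou_threshold=0.5):
    """True if any annotated box has no predicted box overlapping it enough."""
    return any(
        all(iou(gt, pred) < iou_threshold for pred in predictions)
        for gt in ground_truth
    )
```

Images for which `has_missed_object` returns `True` are the ones the model has something left to learn from.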

On a shorter schedule, Option 3 is appropriate if the new images are reasonably balanced across the domain categories (in some sense a representative subset of the previous dataset).
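Whether a weekly batch is "reasonably balanced" can be checked before training on it alone. The following is a minimal sketch that compares the class mix of the new batch with that of the previous dataset, assuming one class label per annotated box. The function names are illustrative, and what counts as an acceptable gap is left to judgment.

```python
from collections import Counter


def class_fractions(labels):
    """Fraction of boxes per class, e.g. {'car': 0.75, 'person': 0.25}."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def balance_gap(previous_labels, new_labels):
    """Largest absolute difference in class fraction between the two sets.

    0.0 means the class mix is identical; values near 1.0 mean the new
    batch is dominated by classes that were rare (or absent) before.
    """
    previous = class_fractions(previous_labels)
    new = class_fractions(new_labels)
    classes = set(previous) | set(new)
    return max(
        (abs(previous.get(c, 0.0) - new.get(c, 0.0)) for c in classes),
        default=0.0,
    )
```

A large gap suggests mixing the new images into the previous dataset (Option 2) instead of training on them alone.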

Option 2 seems pretty safe and is easier to understand. When you say "the algorithm might be weighting the earlier images more", I don't see why this is a problem if the earlier images are "good". However, I can see that the domain may change over time (evolution) in which case you may well wish to counter-weight older images. I understand that you can modify the training data to do just that as discussed in this question:

Class weights for balancing data in TensorFlow Object Detection API
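Counter-weighting older images can also be approximated at the data level, by sampling the training set so that recent images appear more often. This is a swapped-in illustration, not the mechanism described in the linked question. The sketch assumes the age of each image in weeks is known, and the 8-week half-life and function names are arbitrary, hypothetical choices.

```python
import random


def recency_weights(ages_in_weeks, half_life_weeks=8.0):
    """Exponential decay: an image loses half its weight every half-life."""
    return [0.5 ** (age / half_life_weeks) for age in ages_in_weeks]


def sample_training_images(image_ids, ages_in_weeks, k,
                           half_life_weeks=8.0, seed=None):
    """Draw k image IDs with replacement, favoring recent images."""
    rng = random.Random(seed)
    weights = recency_weights(ages_in_weeks, half_life_weeks)
    return rng.choices(image_ids, weights=weights, k=k)
```

The sampled list (with repeats) would then be used to build the training records. A longer half-life moves the result toward treating all images equally, as in Option 1.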



Source: https://stackoverflow.com/questions/51884713/updating-tensorflow-object-detection-model-with-new-images
