Data augmentation in test/validation set?

前端 未结 7 1526
甜味超标
甜味超标 2021-02-12 14:57

It is common practice to augment data (add samples programmatically, such as random crops, etc. in the case of a dataset consisting of images) on both training and test set, or

7条回答
  •  一向
    一向 (楼主)
    2021-02-12 15:24

    Do it only on the training set. And, of course, make sure that the augmentation does not make the label wrong (e.g. when rotating 6 and 9 by about 180°).

    The reason why we use a training and a test set in the first place is that we want to estimate the error our system will have in reality. So the data for the test set should be as close to real data as possible.

    If you do it on the test set, you might have the problem that you introduce errors. For example, say you want to recognize digits and you augment by rotating. Then a 6 might look like a 9. But not all examples are that easy. Better be save than sorry.

提交回复
热议问题