Binary Image Classification with CNN - best practices for choosing “negative” dataset? [closed]

失恋的感觉

2 Answers

无人共我

2021-01-14 12:21

As in all of supervised machine learning, the training set should reflect the real distribution that the model is going to work with. A neural network is basically a function approximator. Your actual goal is to approximate the real-world distribution, but in practice you can only draw a sample from it, and this sample is the only thing the neural network will ever see. For any input far outside the training manifold, the output will be just a guess (see also this discussion on AI.SE).

So when choosing a negative dataset, the first question you should answer is: what is the likely use case of this model? E.g., if you're building an app for a smartphone, then the negative sample should probably include street views, pictures of buildings and stores, people, indoor environments, etc. It's unlikely that an image from a smartphone camera will show a wild animal or an abstract painting, i.e., that is an improbable input in your real distribution.

Including images that resemble the positive class (trucks, airplanes, boats, etc.) is a good idea: their low-level convolutional features (edges, corners) will be very similar to those of the positive class, so the network has to learn the distinguishing high-level features correctly.
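As a rough illustration of mixing look-alike images with typical background scenes, here is a minimal sketch in plain Python. The function name `build_negative_set`, the parameters `ratio` and `hard_fraction`, their default values, and the file names are all hypothetical, not something prescribed by this answer; tune them for your own data.

```python
import random

def build_negative_set(hard_negatives, background_negatives, n_positives,
                       ratio=5, hard_fraction=0.3, seed=0):
    """Sample roughly ratio * n_positives negative examples.

    hard_negatives: items that resemble the positive class (trucks, boats, ...).
    background_negatives: scenes typical of the deployment environment.
    ratio and hard_fraction are illustrative assumptions, not recommended values.
    """
    rng = random.Random(seed)
    n_total = ratio * n_positives
    # Never ask for more items than a pool actually contains.
    n_hard = min(round(n_total * hard_fraction), len(hard_negatives))
    n_background = min(n_total - n_hard, len(background_negatives))
    sample = (rng.sample(hard_negatives, n_hard)
              + rng.sample(background_negatives, n_background))
    rng.shuffle(sample)  # avoid feeding all look-alikes first
    return sample

# Hypothetical file lists, for demonstration only.
hard = [f"truck_{i}.jpg" for i in range(1000)]
background = [f"street_{i}.jpg" for i in range(5000)]
negatives = build_negative_set(hard, background, n_positives=500)
```

Sampling without replacement from two separate pools keeps the share of look-alike images explicit, which makes it easy to vary that share and check its effect on validation accuracy.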

In general, I'd use 5-10x more negative images than positive ones. CIFAR-10 is a good starting point: of its 50,000 training images, 5,000 are cars, 5,000 are planes, and so on. In fact, building a 10-class classifier is not a bad idea. In that case, you turn the CNN into a binary classifier by thresholding its confidence that the inferred class is a car: anything the CNN isn't confident about is interpreted as not a car.
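The thresholding step could be sketched as follows, assuming the network outputs ten raw logits in CIFAR-10's standard label order (where index 1 is "automobile"). The helper names `softmax` and `is_car` and the threshold of 0.8 are assumptions for illustration; the threshold in particular should be tuned on a validation set.

```python
import math

CAR_CLASS = 1  # "automobile" in CIFAR-10's standard label order

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def is_car(logits, threshold=0.8):
    """Return True only if 'car' is the top class AND its probability
    reaches the threshold; everything else counts as 'not a car'.
    The default threshold is a hypothetical value, not a recommendation.
    """
    probs = softmax(logits)
    predicted = max(range(len(probs)), key=probs.__getitem__)
    return predicted == CAR_CLASS and probs[CAR_CLASS] >= threshold

confident_car = [0.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
unsure_car = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(is_car(confident_car), is_car(unsure_car))
```

In the second example "car" is still the most likely class, but its probability is too low, so the input is rejected. Raising the threshold trades recall for precision.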

终归单人心

2021-01-14 12:35

I think the negative samples should be selected according to the setting your model will work in. If your model runs on the street as a car detector, reasonable negative samples are road backgrounds, trees, pedestrians, and other vehicles that are common on the street. So I don't think there is a universal rule for selecting negative samples; it depends on your needs.
