Question
I am working with a dataset of grayscale images. Is there a way to determine whether a new grayscale image would contribute to the diversity of the dataset? I would like to prevent the dataset from having too many similar samples.
Answer 1:
Well, what do you see when you look at it? If you know the images in this dataset, you can probably assess for yourself whether the new sample repeats a pattern that is already included in the dataset or whether it is something unique.
Another idea is to compare the images analytically. Depending on the case, you may want to compute the per-pixel averages of your training set (for 8-bit images, each lies between 0 and 255) and compare them with the pixel values of the sample image. Other similarity measures may work as well.
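As a rough sketch of that comparison, assuming all images share the same height and width and are stacked into one NumPy array of shape `(N, H, W)`: the function names below are made up for illustration, and the nearest-image distance is one possible "other measure" that I am adding here, not something prescribed above.

```python
import numpy as np

def mean_image_distance(dataset, sample):
    """Mean absolute difference between `sample` and the per-pixel dataset average."""
    dataset = np.asarray(dataset, dtype=np.float64)  # shape (N, H, W)
    sample = np.asarray(sample, dtype=np.float64)    # shape (H, W)
    mean_image = dataset.mean(axis=0)                # per-pixel average, shape (H, W)
    return float(np.abs(sample - mean_image).mean())

def nearest_image_distance(dataset, sample):
    """Mean absolute difference between `sample` and its closest dataset image."""
    dataset = np.asarray(dataset, dtype=np.float64)
    sample = np.asarray(sample, dtype=np.float64)
    # One distance per dataset image, averaged over all pixels of that image.
    per_image = np.abs(dataset - sample).mean(axis=(1, 2))
    return float(per_image.min())
```

A small value from `nearest_image_distance` suggests the sample is close to something you already have. What counts as "small" depends on your data, so you would have to pick a threshold by looking at the distances between images that are already in the dataset.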
What I would do, if you have a model trained on your current dataset, is use that model to predict/classify the sample image and see how well it performs and how confident it is. This way, you can perhaps assess whether your model (and the dataset you trained it with) has something to learn from this new sample image.
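One way to put a number on "confidence" is sketched below, assuming your model can output a vector of class probabilities for the sample (for example a softmax output). The function name and the `0.6` threshold are hypothetical; the right cut-off depends on your model and data.

```python
import numpy as np

def prediction_uncertainty(probs):
    """Return (top-class confidence, entropy in nats) for one probability vector."""
    probs = np.asarray(probs, dtype=np.float64)
    confidence = float(probs.max())
    nonzero = probs[probs > 0]  # skip zeros so log() is always defined
    entropy = float(-(nonzero * np.log(nonzero)).sum())
    return confidence, entropy

def may_add_diversity(probs, min_confidence=0.6):
    """Hypothetical rule: a low-confidence prediction hints at an unfamiliar sample."""
    confidence, _ = prediction_uncertainty(probs)
    return confidence < min_confidence
```

A sample the model classifies with low confidence (or high entropy) is a candidate for adding to the dataset. A confident prediction does not prove the image is redundant, though, so I would treat this as a hint and not as a hard filter.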
Source: https://stackoverflow.com/questions/55365075/measuring-how-a-new-sample-contributes-to-the-diversity-of-a-dataset