(An update to this question has been added.)
I am a graduate student at the University of Ghent, Belgium; my research is about emotion recognition w
Why don't you use the InfogainLoss layer to compensate for the imbalance in your training set?
The Infogain loss is defined using a weight matrix H (in your case 2-by-2). The meaning of its entries is:
[cost of predicting 1 when gt is 0,  cost of predicting 0 when gt is 0
 cost of predicting 1 when gt is 1,  cost of predicting 0 when gt is 1]
So, you can set the entries of H to reflect the difference between an error in predicting 0 and an error in predicting 1.
You can find how to define matrix H for Caffe in this thread.
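If it helps, here is a rough sketch of how such an H matrix could be saved from Python as a binaryproto blob that an InfogainLoss layer can load; the cost values and the file name are placeholders I chose for illustration, not values from your setup:

import numpy as np
import caffe

# fill in the four costs as described above; these numbers are only placeholders
H = np.array([[1.0, 5.0],
              [5.0, 1.0]], dtype='f4')

# InfogainLoss reads H as a blob, typically stored in a .binaryproto file
blob = caffe.io.array_to_blobproto(H.reshape((1, 1, 2, 2)))
with open('infogain_H.binaryproto', 'wb') as f:
    f.write(blob.SerializeToString())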
Regarding sample weights, you may find this post interesting: it shows how to modify the SoftmaxWithLoss layer to take into account sample weights.
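The linked post patches Caffe's SoftmaxWithLoss layer in C++; purely as an illustration of the same idea, a per-sample-weighted cross-entropy can be sketched in a few lines of PyTorch (the function name and the normalization by the weight sum are my own choices):

import torch.nn.functional as F

def weighted_cross_entropy(logits, targets, sample_weights):
    # per-example loss, then scale each example by its own weight
    per_example = F.cross_entropy(logits, targets, reduction='none')
    return (per_example * sample_weights).sum() / sample_weights.sum()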
Recently, a modification to the cross-entropy loss was proposed by Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He and Piotr Dollár: Focal Loss for Dense Object Detection (ICCV 2017).
The idea behind focal loss is to assign a different weight to each example based on the relative difficulty of predicting that example (rather than based on class size, etc.). From the brief time I got to experiment with this loss, it feels superior to "InfogainLoss" with class-size weights.
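For reference, the paper's core formula is FL(p_t) = -(1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. A minimal sketch of this in PyTorch follows; the function name, parameter names, and the optional per-class alpha weights are my own illustration, not code from the paper:

import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=None):
    # logits: (N, C) raw scores, targets: (N,) class indices
    log_p = F.log_softmax(logits, dim=1)
    log_pt = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)  # log-prob of the true class
    pt = log_pt.exp()
    loss = -((1.0 - pt) ** gamma) * log_pt   # (1 - pt)^gamma down-weights easy examples
    if alpha is not None:                    # optional per-class weight tensor, shape (C,)
        loss = loss * alpha[targets]
    return loss.mean()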
I have also come across this class imbalance problem in my classification task. Right now I am using CrossEntropyLoss with the weight argument (documentation here) and it works fine. The idea is to give more loss to samples from classes with a smaller number of images: the weight for each class is inversely proportional to the number of images in that class. Here is a snippet to calculate the weight for all classes using numpy:
import numpy as np

# train_labels is a list of class labels for all training samples
# the labels are in range [0, n-1] (n classes in total)
train_labels = np.asarray(train_labels)
num_cls = np.unique(train_labels).size

# count the number of samples in each class
cls_num = []
for i in range(num_cls):
    cls_num.append(len(np.where(train_labels == i)[0]))
cls_num = np.array(cls_num)

# inverse frequency: rarer classes get larger weights
cls_num = cls_num.max() / cls_num

# normalize so the weights sum to 1; `weight` is the array with one
# entry per class to use in CrossEntropyLoss
x = 1.0 / np.sum(cls_num)
weight = x * cls_num
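To connect this to the loss, the resulting array can then be passed to CrossEntropyLoss roughly like this (converting it to a float tensor first; the variable names are placeholders):

import torch
import torch.nn as nn

class_weights = torch.tensor(weight, dtype=torch.float32)  # `weight` from the snippet above
criterion = nn.CrossEntropyLoss(weight=class_weights)
# loss = criterion(logits, targets)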