Multiple category classification in Caffe

Submitted by 雨燕双飞 on 2019-12-17 15:59:06

Question


I thought we might be able to compile a Caffeinated description of some methods of performing multiple category classification.

By multi-category classification I mean that the input data contains representations of several model output categories and/or can simply be classified under several model output categories at once.

E.g. an image containing a cat and a dog would ideally output ~1 for both the cat and dog prediction categories and ~0 for all others.

  1. Based on this paper, this stale and closed PR and this open PR, it seems Caffe is perfectly capable of accepting labels. Is this correct?

  2. Would the construction of such a network require the use of multiple neuron (inner product -> relu -> inner product) and softmax layers as in page 13 of this paper; or does Caffe's ip & softmax presently support multiple label dimensions?

  3. When I pass my labels to the network, which of the following examples illustrates the correct approach (if not both)?

    E.g. a cat eating an apple. Note: Python syntax, but I use the C++ source.

    Column 0 - Class is in input; Column 1 - Class is not in input

    [[1,0],  # Apple
     [0,1],  # Baseball
     [1,0],  # Cat
     [0,1]]  # Dog
    

    or

    Column 0 - Class is in input

    [[1],  # Apple
     [0],  # Baseball
     [1],  # Cat
     [0]]  # Dog
    

If anything lacks clarity please let me know and I will generate pictorial examples of the questions I'm trying to ask.


Answer 1:


Nice question. I believe there is no single "canonical" answer here and you may find several different approaches to tackle this problem. I'll do my best to show one possible way. It is slightly different than the question you asked, so I'll re-state the problem and suggest a solution.

The problem: given an input image and a set of C classes, indicate for each class if it is depicted in the image or not.

Inputs: at training time, each input is a pair of an image and a C-dim binary vector indicating, for each of the C classes, whether it is present in the image.

Output: given an image, output a C-dim binary vector (same as the second form suggested in your question).
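As a rough illustration of the C-dim binary vector described above, here is a minimal sketch in plain Python. The class list and the `encode_labels` helper are hypothetical names chosen for this example (the classes are borrowed from the question); this is not part of Caffe.

```python
# Hypothetical class list, borrowed from the example in the question.
CLASSES = ["apple", "baseball", "cat", "dog"]


def encode_labels(present, classes=CLASSES):
    """Return a C-dim binary vector: 1.0 if the class is in the image, else 0.0."""
    present = set(present)
    unknown = present - set(classes)
    if unknown:
        raise ValueError("unknown classes: %s" % sorted(unknown))
    return [1.0 if name in present else 0.0 for name in classes]


# "Cat eating apple" from the question.
label = encode_labels(["cat", "apple"])
print(label)  # [1.0, 0.0, 1.0, 0.0]
```

Each image gets one such vector as its label, which corresponds to the single-column form in the question.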

Making Caffe do the job: to make this work we need to modify the top layers of the net and use a different loss.
But first, let's understand the usual way Caffe is used and then look into the changes needed.
The way things are now: an image is fed into the net, goes through conv/pooling/... layers and finally goes through an "InnerProduct" layer with C outputs. These C predictions go into a "Softmax" layer that normalizes them into a probability distribution, inhibiting all but the most dominant class. Once a single class is highlighted, the "SoftmaxWithLoss" layer checks that the highlighted predicted class matches the ground truth class.

What you need: the problem with the existing approach is the "Softmax" layer, which basically selects a single class. I suggest you replace it with a "Sigmoid" layer that maps each of the C outputs into an indicator of whether this specific class is present in the image. For training, you should use "SigmoidCrossEntropyLoss" instead of the "SoftmaxWithLoss" layer. Note that "SigmoidCrossEntropyLoss" applies the sigmoid internally, so it should be fed the raw "InnerProduct" outputs.
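To see why the swap matters, the following sketch compares the two activations on the same scores. It is plain Python, not Caffe code; the `scores` values are made up for illustration and stand in for the C outputs of the final "InnerProduct" layer.

```python
import math


def softmax(scores):
    """Normalize scores into a distribution that sums to 1 across classes."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(scores):
    """Squash each score independently into (0, 1)."""
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]


# Hypothetical raw scores for [apple, baseball, cat, dog];
# apple and cat both score high.
scores = [4.0, -3.0, 4.0, -3.0]

print(softmax(scores))  # apple and cat must share the probability mass
print(sigmoid(scores))  # apple and cat can both be close to 1
```

With softmax, two equally strong classes can each receive at most about 0.5, whereas the per-class sigmoid lets both approach 1, which is the behaviour the question asks for.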




Answer 2:


Since one image can have multiple labels, the most intuitive way is to treat this problem as C independent binary classification problems, where C is the total number of different classes. So it is easy to understand what @Shai has said:

add a "Sigmoid" layer that maps each of the C outputs into an indicator of whether this specific class is present in the image, and use "SigmoidCrossEntropyLoss" instead of the "SoftmaxWithLoss" layer. The loss is then the sum of the C per-class sigmoid cross-entropy terms.
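The "sum of C binary losses" view can be sketched as follows. This is a plain-Python illustration under the assumption that the loss is the unnormalized sum over classes; Caffe's own layer may additionally normalize the result (for example by batch size), which this sketch ignores. The function name and the sample values are hypothetical.

```python
import math


def sigmoid_cross_entropy(scores, labels):
    """Sum over classes of the binary cross-entropy between sigmoid(score) and label."""
    total = 0.0
    for x, t in zip(scores, labels):
        # Numerically stable form of -[t*log(p) + (1-t)*log(1-p)] with p = sigmoid(x).
        total += max(x, 0.0) - x * t + math.log1p(math.exp(-abs(x)))
    return total


labels = [1.0, 0.0, 1.0, 0.0]        # apple and cat are present
good_scores = [4.0, -3.0, 4.0, -3.0]  # agrees with the labels
bad_scores = [-4.0, 3.0, -4.0, 3.0]   # disagrees with the labels

print(sigmoid_cross_entropy(good_scores, labels))  # small loss
print(sigmoid_cross_entropy(bad_scores, labels))   # much larger loss
```

Each class contributes its own term, so a mistake on one class is penalized independently of the others.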



Source: https://stackoverflow.com/questions/33112941/multiple-category-classification-in-caffe
