Question
I've created a neural network with a sigmoid activation function in the last layer, so I get results between 0 and 1. I want to classify things into 2 classes, so I check "is the number > 0.5, then class 1 else class 0". All basic.
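For reference, a minimal sketch of that thresholding rule (the variable names are hypothetical; it assumes the last layer returns a single value between 0 and 1):

```python
# output: the single sigmoid value produced by the last layer (hypothetical name)
output = 0.73

# The basic decision rule described above: threshold at 0.5.
predicted_class = 1 if output > 0.5 else 0
print(predicted_class)  # -> 1
```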
However, I would like to say "the probability of it being in class 0 is x and in class 1 is y".
How can I do this?
- Does a number like 0.73 tell me it's 73% sure to be in class 1? And then 1-0.73 = 0.27 so 27% in class 0?
- When it's 0.27, does that mean it's 27% sure it's in class 0 and 73% in class 1? That makes no sense.
Should I work with the 0.5 and look at how far away from that center the number is, and treat that distance as the percentage?
Or am I misunderstanding the result of the NN?
Answer 1:
As pointed out by Teja, the short answer is no; however, depending on the loss you use, it may be closer to the truth than you might think.
Imagine you try to train your network to differentiate numbers into two arbitrary categories, beautiful and ugly. Say your input numbers are either 0 or 1, and 0s have a 0.2 probability of being labelled ugly, whereas 1s have a 0.6 probability of being ugly.
Imagine that your neural network takes 0s and 1s as inputs, passes them through some layers, and ends in a softmax function. If your loss is binary cross-entropy, then the optimal solution for your network is to output 0.2 when it sees a 0 in input and 0.6 when it sees a 1 in input (this is a property of the cross-entropy, which is minimized when you output the true probabilities of each label). Therefore, you can interpret these numbers as probabilities.
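As a quick numerical check of that property (not part of the original answer; plain NumPy, with the 0.2 and 0.6 taken from the toy example above), one can scan candidate outputs and see that the expected binary cross-entropy is lowest exactly at the true label probabilities:

```python
import numpy as np

# Toy setup from the answer: inputs are 0 or 1;
# P(ugly | input=0) = 0.2, P(ugly | input=1) = 0.6.
true_p_ugly = {0: 0.2, 1: 0.6}

def expected_bce(p_true, q):
    """Expected binary cross-entropy when the label is 'ugly' with
    probability p_true and the network outputs q."""
    return -(p_true * np.log(q) + (1 - p_true) * np.log(1 - q))

candidates = np.linspace(0.01, 0.99, 99)  # candidate network outputs
for x, p_true in true_p_ugly.items():
    losses = [expected_bce(p_true, q) for q in candidates]
    best_q = candidates[int(np.argmin(losses))]
    print(f"input={x}: expected BCE is minimized at q={best_q:.2f} (true probability {p_true})")
```

Running this prints a minimizer of 0.20 for input 0 and 0.60 for input 1, which is the property the answer relies on.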
Of course, real-world examples are not that easy and are generally deterministic, so the interpretation is a little bit tricky. However, I believe it is not entirely wrong to think of your results as probabilities, as long as you use the cross-entropy as a loss.
I'm sorry, this answer is not black or white, but reality is sometimes complex ;)
Answer 2:
Does a number like 0.73 tell me it's 73% sure to be in class 1? And then 1-0.73 = 0.27 so 27% in class 0?
The answer is no. When you use a sigmoid function on each class output, the classes are scored independently, so the results will not necessarily sum to 1: the sum over the classes can be less than 1 or, in some cases, greater than 1.
If you use the softmax function instead, the sum of all the outputs will always be 1.
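To illustrate the difference, a small sketch (the logits are made-up numbers, not from the answer): per-class sigmoid outputs are squashed independently and generally do not sum to 1, while softmax outputs are normalized jointly and always do.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()

# Hypothetical raw scores (logits) for two classes from some network.
logits = np.array([1.2, -0.4])

sig = sigmoid(logits)    # each class squashed independently
soft = softmax(logits)   # classes normalized jointly

print("sigmoid outputs:", sig, "sum =", sig.sum())    # sum is generally != 1
print("softmax outputs:", soft, "sum =", soft.sum())  # sum is exactly 1
```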
Source: https://stackoverflow.com/questions/57903518/interpreting-a-sigmoid-result-as-probability-in-neural-networks