Question
I've created a neural network with a sigmoid activation function in the last layer, so I get results between 0 and 1. I want to classify things into 2 classes, so I check "is the number > 0.5, then class 1 else class 0". All basic.
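For reference, a minimal sketch of that thresholding rule (the variable names are hypothetical; it assumes the last layer returns a single value between 0 and 1):

```python
# output: the single sigmoid value produced by the last layer (hypothetical name)
output = 0.73

# The basic decision rule described above: threshold at 0.5.
predicted_class = 1 if output > 0.5 else 0
print(predicted_class)  # -> 1
```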
However, I would like to say "the probability of it being in class 0 is x and in class 1 is y".
How can I do this?
- Does a number like 0.73 tell me it's 73% sure to be in class 1? And then 1-0.73 = 0.27 so 27% in class 0?
- When it's 0.27, does that mean it's 27% sure it's in class 0 and 73% in class 1? That makes no sense.
Should I work with the 0.5 and look at how far away from that center the number is, and treat that distance as the percentage?
Or am I misunderstanding the result of the NN?
Answer 1:
As pointed out by Teja, the short answer is no; however, depending on the loss you use, it may be closer to the truth than you might think.
Imagine you try to train your network to differentiate numbers into two arbitrary categories, beautiful and ugly. Say your input numbers are either 0 or 1, and 0s have a 0.2 probability of being labelled ugly, whereas 1s have a 0.6 probability of being ugly.
Imagine that your neural network takes 0s and 1s as inputs, passes them through some layers, and ends in a softmax function. If your loss is binary cross-entropy, then the optimal solution for your network is to output 0.2 when it sees a 0 in input and 0.6 when it sees a 1 in input (this is a property of the cross-entropy, which is minimized when you output the true probabilities of each label). Therefore, you can interpret these numbers as probabilities.
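As a quick numerical check of that property (not part of the original answer; plain NumPy, with the 0.2 and 0.6 taken from the toy example above), one can scan candidate outputs and see that the expected binary cross-entropy is lowest exactly at the true label probabilities:

```python
import numpy as np

# Toy setup from the answer: inputs are 0 or 1;
# P(ugly | input=0) = 0.2, P(ugly | input=1) = 0.6.
true_p_ugly = {0: 0.2, 1: 0.6}

def expected_bce(p_true, q):
    """Expected binary cross-entropy when the label is 'ugly' with
    probability p_true and the network outputs q."""
    return -(p_true * np.log(q) + (1 - p_true) * np.log(1 - q))

candidates = np.linspace(0.01, 0.99, 99)  # candidate network outputs
for x, p_true in true_p_ugly.items():
    losses = [expected_bce(p_true, q) for q in candidates]
    best_q = candidates[int(np.argmin(losses))]
    print(f"input={x}: expected BCE is minimized at q={best_q:.2f} (true probability {p_true})")
```

Running this prints a minimizer of 0.20 for input 0 and 0.60 for input 1, which is the property the answer relies on.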
Of course, real-world examples are not that easy and are generally deterministic, so the interpretation is a little bit tricky. However, I believe it is not entirely wrong to think of your results as probabilities, as long as you use the cross-entropy as a loss.
I'm sorry, this answer is not black or white, but reality is sometimes complex ;)
Answer 2:
Does a number like 0.73 tell me it's 73% sure to be in class 1? And then 1-0.73 = 0.27 so 27% in class 0?
The answer is no. When you use a sigmoid function on each class output, the classes are scored independently, so the results will not necessarily sum to 1: the sum over the classes can be less than 1 or, in some cases, greater than 1.
If you use the softmax function instead, the sum of all the outputs will always be 1.
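To illustrate the difference, a small sketch (the logits are made-up numbers, not from the answer): per-class sigmoid outputs are squashed independently and generally do not sum to 1, while softmax outputs are normalized jointly and always do.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()

# Hypothetical raw scores (logits) for two classes from some network.
logits = np.array([1.2, -0.4])

sig = sigmoid(logits)    # each class squashed independently
soft = softmax(logits)   # classes normalized jointly

print("sigmoid outputs:", sig, "sum =", sig.sum())    # sum is generally != 1
print("softmax outputs:", soft, "sum =", soft.sum())  # sum is exactly 1
```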
Source: https://stackoverflow.com/questions/57903518/interpreting-a-sigmoid-result-as-probability-in-neural-networks