Just adding this clarification so that anyone who scrolls down this far can at least get it right, since so many wrong answers have been upvoted.
Diansheng's answer and JakeJ's answer get it right.
A new answer posted by Shital Shah is an even better and more complete answer.
Yes, logit exists as a mathematical function in statistics, but the logit used in the context of neural networks is different. The statistical logit doesn't even make sense here.
I couldn't find a formal definition anywhere, but logit basically means:
The raw predictions which come out of the last layer of the neural network.
1. This is the very tensor on which you apply the argmax function to get the predicted class.
2. This is the very tensor which you feed into the softmax function to get the probabilities for the predicted classes.
Also, from a tutorial on the official TensorFlow website:
Logits Layer
The final layer in our neural network is the logits layer, which will return the raw values for our predictions. We create a dense layer with 10 neurons (one for each target class 0–9), with linear activation (the default):
logits = tf.layers.dense(inputs=dropout, units=10)
If you are still confused, the situation is like this:
raw_predictions = neural_net(input_layer)
predicted_class_index_by_raw = argmax(raw_predictions)
probabilities = softmax(raw_predictions)
predicted_class_index_by_prob = argmax(probabilities)
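where predicted_class_index_by_raw and predicted_class_index_by_prob will be equal, because softmax is monotonic: it preserves the ordering of the raw values, so the largest raw value also gets the largest probability.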
Another name for raw_predictions in the above code is logit.
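For anyone who wants to run the situation above, here is a minimal sketch in plain Python. The softmax and argmax helpers and the four example values are my own assumptions for illustration; a real network would produce raw_predictions from its last layer.

```python
import math

def softmax(logits):
    """Turn raw scores (logits) into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical raw outputs of the last layer for a 4-class problem
raw_predictions = [2.0, -1.0, 0.5, 3.2]

probabilities = softmax(raw_predictions)
predicted_class_index_by_raw = argmax(raw_predictions)
predicted_class_index_by_prob = argmax(probabilities)

print(predicted_class_index_by_raw, predicted_class_index_by_prob)
```

Both indices come out the same, which is why you can skip the softmax when all you need is the predicted class.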
As for why it is called logit... I have no idea. Sorry.
[Edit: See this answer for the historical motivations behind the term.]
Trivia
That said, if you want to, you can apply the statistical logit to the probabilities that come out of the softmax function.
If the probability of a certain class is p, then the log-odds of that class is L = logit(p) = log(p / (1 - p)).
The probability of that class can then be recovered as p = sigmoid(L), using the sigmoid function.
Not very useful to calculate log-odds though.
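If you want to check the round trip between p and L yourself, here is a small sketch using only the standard library. The function names and the example value p = 0.8 are assumptions chosen for illustration, not taken from any particular library.

```python
import math

def logit(p):
    """Statistical logit: the log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def sigmoid(x):
    """Inverse of the statistical logit."""
    return 1.0 / (1.0 + math.exp(-x))

p = 0.8               # hypothetical probability of some class
L = logit(p)          # log-odds: log(0.8 / 0.2) = log(4)
recovered = sigmoid(L)

print(L, recovered)
```

Up to floating-point error, recovered equals the original p.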