Question
For an assignment I have to implement both the Hinge loss and its partial derivative calculation functions. I got the Hinge loss function itself, but I'm having a hard time understanding how to calculate its partial derivative w.r.t. the prediction input. I tried different approaches but none worked.
Any help, hints, suggestions will be much appreciated!
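Here is the analytical expression for the Hinge loss function itself, where $\hat{y}$ is `target_pred`, $y$ is `target_true`, and $N$ is `n_objects`:
$$L(\hat{y}, y) = \frac{1}{N} \sum_{i=1}^{N} \max(0,\ 1 - y_i \hat{y}_i)$$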
And here is my Hinge loss function implementation:
import numpy as np

def hinge_forward(target_pred, target_true):
    """Compute the value of Hinge loss
        for a given prediction and the ground truth
    # Arguments
        target_pred: predictions - np.array of size `(n_objects,)`
        target_true: ground truth - np.array of size `(n_objects,)`
    # Output
        the value of Hinge loss
        for a given prediction and the ground truth
        scalar
    """
    # Average of the per-object losses max(0, 1 - y * y_hat)
    output = np.sum((np.maximum(0, 1 - target_pred * target_true)) / target_pred.size)
    return output
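To make the formula concrete, here is a small worked example. The sample labels and scores are made up for illustration, and `hinge_loss` is a hypothetical stand-in that is meant to compute the same quantity as `hinge_forward` above.

```python
import numpy as np

def hinge_loss(target_pred, target_true):
    # Mean over all objects of max(0, 1 - y * y_hat)
    return np.mean(np.maximum(0, 1 - target_pred * target_true))

# Hypothetical sample data: labels in {-1, +1}, raw scores as predictions
target_true = np.array([1.0, -1.0, 1.0, -1.0])
target_pred = np.array([2.0, -0.5, 0.3, 1.0])

# Per-object margins y * y_hat:           [2.0, 0.5, 0.3, -1.0]
# Per-object losses max(0, 1 - margin):   [0.0, 0.5, 0.7,  2.0]
# Their mean is 3.2 / 4, i.e. about 0.8
loss = hinge_loss(target_pred, target_true)
print(loss)
```

Only the first object has a margin of at least 1, so it is the only one that contributes nothing to the loss.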
Now I need to calculate this gradient, i.e. the partial derivative of the loss with respect to each element of `target_pred`:
This is what I tried for the Hinge loss gradient calculation:
def hinge_grad_input(target_pred, target_true):
    """Compute the partial derivative
        of Hinge loss with respect to its input
    # Arguments
        target_pred: predictions - np.array of size `(n_objects,)`
        target_true: ground truth - np.array of size `(n_objects,)`
    # Output
        the partial derivative
        of Hinge loss with respect to its input
        np.array of size `(n_objects,)`
    """
    # ----------------
    # try 1
    # ----------------
    # hinge_result = hinge_forward(target_pred, target_true)
    # if hinge_result == 0:
    #     grad_input = 0
    # else:
    #     hinge = np.maximum(0, 1 - target_pred * target_true)
    #     grad_input = np.zeros_like(hinge)
    #     grad_input[hinge > 0] = 1
    #     grad_input = np.sum(np.where(hinge > 0))

    # ----------------
    # try 2
    # ----------------
    # hinge = np.maximum(0, 1 - target_pred * target_true)
    # grad_input = np.zeros_like(hinge)
    # grad_input[hinge > 0] = 1

    # ----------------
    # try 3
    # ----------------
    hinge_result = hinge_forward(target_pred, target_true)
    if hinge_result == 0:
        grad_input = 0
    else:
        loss = np.maximum(0, 1 - target_pred * target_true)
        grad_input = np.zeros_like(loss)
        grad_input[loss > 0] = 1
        grad_input = np.sum(grad_input) * target_pred
    return grad_input
Answer 1:
I've managed to solve this using the np.where() function. Here is the code:
def hinge_grad_input(target_pred, target_true):
    """Compute the partial derivative
        of Hinge loss with respect to its input
    # Arguments
        target_pred: predictions - np.array of size `(n_objects,)`
        target_true: ground truth - np.array of size `(n_objects,)`
    # Output
        the partial derivative
        of Hinge loss with respect to its input
        np.array of size `(n_objects,)`
    """
    grad_input = np.where(target_pred * target_true < 1, -target_true / target_pred.size, 0)
    return grad_input
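Basically, the gradient with respect to each prediction equals -y/N wherever y * ŷ < 1 (that is, target_true * target_pred < 1), and 0 otherwise. This follows because each term max(0, 1 - y * ŷ) / N has slope -y/N in ŷ while it is positive and is flat once the margin y * ŷ exceeds 1.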
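One way to gain confidence in a gradient like this is to compare it against central finite differences. The sketch below is only an illustration: `numerical_grad`, the step size `eps`, and the sample data are assumptions that are not part of the original post, and the two hinge functions are compact restatements of the ones shown above.

```python
import numpy as np

def hinge_forward(target_pred, target_true):
    # Mean over all objects of max(0, 1 - y * y_hat)
    return np.mean(np.maximum(0, 1 - target_pred * target_true))

def hinge_grad_input(target_pred, target_true):
    # -y / N where the margin y * y_hat is below 1, otherwise 0
    return np.where(target_pred * target_true < 1,
                    -target_true / target_pred.size, 0.0)

def numerical_grad(target_pred, target_true, eps=1e-6):
    # Hypothetical helper: central finite differences, one coordinate at a time
    grad = np.zeros_like(target_pred, dtype=float)
    for i in range(target_pred.size):
        step = np.zeros_like(target_pred, dtype=float)
        step[i] = eps
        grad[i] = (hinge_forward(target_pred + step, target_true)
                   - hinge_forward(target_pred - step, target_true)) / (2 * eps)
    return grad

# Made-up sample data; no margin sits exactly at 1, where the loss has a kink
target_true = np.array([1.0, -1.0, 1.0, -1.0])
target_pred = np.array([2.0, -0.5, 0.3, 1.0])

analytic = hinge_grad_input(target_pred, target_true)
numeric = numerical_grad(target_pred, target_true)
print(analytic)
print(np.allclose(analytic, numeric, atol=1e-6))
```

The loss is not differentiable where y * ŷ equals exactly 1. The strict `< 1` comparison returns 0 there, which is one valid subgradient, but a finite-difference check at such a point would not agree with it.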
Source: https://stackoverflow.com/questions/53244095/hinge-loss-function-gradient-w-r-t-input-prediction