Gradient calculation for softmax version of triplet loss

Submitted by 我的未来我决定 on 2019-12-01 08:51:49

Question


I have been trying to implement the softmax version of the triplet loss in Caffe described in
Hoffer and Ailon, Deep Metric Learning Using Triplet Network, ICLR 2015.

I have tried this, but I am finding it hard to calculate the gradient because the L2 norm in the exponent is not squared.

Can someone please help me here?


Answer 1:


Implementing the L2 norm using existing layers of Caffe can save you all the hassle.

Here's one way to compute ||x1-x2||_2 in Caffe for "bottom" blobs x1 and x2 (assuming x1 and x2 are B-by-C blobs, so this computes B norms of C-dimensional differences):

# Elementwise difference x1 - x2 (weighted sum with coefficients 1 and -1)
layer {
  name: "x1-x2"
  type: "Eltwise"
  bottom: "x1"
  bottom: "x2"
  top: "x1-x2"
  eltwise_param {
    operation: SUM
    coeff: 1 coeff: -1
  }
}
# Sum of squares over axis 1: one value ||x1-x2||_2^2 per sample
layer {
  name: "sqr_norm"
  type: "Reduction"
  bottom: "x1-x2"
  top: "sqr_norm"
  reduction_param { operation: SUMSQ axis: 1 }
}
# Elementwise square root, giving ||x1-x2||_2
layer {
  name: "sqrt"
  type: "Power"
  bottom: "sqr_norm"
  top: "sqrt"
  power_param { power: 0.5 }
}

For the triplet loss defined in the paper, you need to compute the L2 norm for x-x+ and for x-x- (each with the recipe above), concatenate the two resulting blobs, and feed the concatenated blob to a "Softmax" layer; a sketch of those remaining layers follows.
No need for messy manual gradient computations.
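
As a rough sketch, assuming the recipe above was applied twice to produce per-sample distance blobs named, say, "dist_pos" (for x-x+) and "dist_neg" (for x-x-), the remaining layers could look like this. The Reshape layers are needed because Reduction leaves a 1-D blob of length B, while Concat and Softmax expect an explicit channel axis:

# Give each length-B distance blob an explicit channel axis: B -> B-by-1
layer {
  name: "reshape_pos"
  type: "Reshape"
  bottom: "dist_pos"
  top: "dist_pos_2d"
  reshape_param { shape { dim: -1 dim: 1 } }
}
layer {
  name: "reshape_neg"
  type: "Reshape"
  bottom: "dist_neg"
  top: "dist_neg_2d"
  reshape_param { shape { dim: -1 dim: 1 } }
}
# Stack the two distances side by side: B-by-2
layer {
  name: "dists"
  type: "Concat"
  bottom: "dist_pos_2d"
  bottom: "dist_neg_2d"
  top: "dists"
  concat_param { axis: 1 }
}
# Softmax across the two distances of each sample
layer {
  name: "softmax"
  type: "Softmax"
  bottom: "dists"
  top: "softmax"
}

The paper's loss is then the mean squared error of this softmax output against the constant target (0, 1), which can likewise be built from stock layers (e.g. "EuclideanLoss" against a fixed target blob) rather than a hand-written gradient.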




Answer 2:


This is really a math question, but here goes. The first equation below is the squared form you are used to; the second is the gradient when the norm is not squared.
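
Spelled out (as a reconstruction, since the equations themselves are not shown here) in the notation used above, with v = x1 - x2:

d/dv ||v||_2^2 = 2 v

d/dv ||v||_2 = v / ||v||_2

That is, the unsquared gradient is the squared-case gradient scaled by 1 / (2 ||v||_2), which follows from the chain rule applied to the square root. It is undefined at v = 0, which implementations usually guard with a small epsilon.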



Source: https://stackoverflow.com/questions/36277060/gradient-calculation-for-softmax-version-of-triplet-loss
