Gradient calculation for softmax version of triplet loss

终归单人心 2021-01-15 15:19

I have been trying to implement, in Caffe, the softmax version of the triplet loss described in
Hoffer and Ailon, Deep Metric Learning Using Triplet Network, ICLR 2015.
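
For reference, that loss applies a softmax to the two embedding distances and then penalizes the squared error against the target (0, 1). Roughly, in my own notation (not quoting the paper verbatim):

    d_+ = \frac{e^{\|Net(x)-Net(x^+)\|_2}}{e^{\|Net(x)-Net(x^+)\|_2} + e^{\|Net(x)-Net(x^-)\|_2}}, \qquad
    d_- = \frac{e^{\|Net(x)-Net(x^-)\|_2}}{e^{\|Net(x)-Net(x^+)\|_2} + e^{\|Net(x)-Net(x^-)\|_2}}

    \mathrm{Loss}(x, x^-, x^+) = \left\| (d_+,\; d_- - 1) \right\|_2^2

The part I am stuck on is the gradient of the non-squared \|\cdot\|_2 terms.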

2 Answers
  • 2021-01-15 16:02

    This is a math question, but here it goes: the gradient of the squared L2 distance is the one you are used to; when the distance is not squared, the gradient picks up an extra normalization by the distance itself.
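
    In symbols (a sketch in my notation; the answer's original equations are not reproduced above), for two embeddings x1 and x2:

        \frac{\partial}{\partial x_1}\|x_1-x_2\|_2^2 = 2\,(x_1-x_2),
        \qquad
        \frac{\partial}{\partial x_1}\|x_1-x_2\|_2 = \frac{x_1-x_2}{\|x_1-x_2\|_2}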

  • 2021-01-15 16:04

    Implementing the L2 norm using existing Caffe layers can save you all the hassle.

    Here's one way to compute ||x1-x2||_2 in Caffe for "bottom" blobs x1 and x2 (assuming x1 and x2 are B-by-C blobs, so this computes B norms over C-dimensional differences):

    # elementwise difference: x1 + (-1)*x2
    layer {
      name: "x1-x2"
      type: "Eltwise"
      bottom: "x1"
      bottom: "x2"
      top: "x1-x2"
      eltwise_param {
        operation: SUM
        coeff: 1 coeff: -1
      }
    }
    # sum of squares over axis 1: ||x1-x2||_2^2, one value per row
    layer {
      name: "sqr_norm"
      type: "Reduction"
      bottom: "x1-x2"
      top: "sqr_norm"
      reduction_param { operation: SUMSQ axis: 1 }
    }
    # square root: ||x1-x2||_2
    layer {
      name: "sqrt"
      type: "Power"
      bottom: "sqr_norm"
      top: "sqrt"
      power_param { power: 0.5 }
    }
    

    For the triplet loss defined in the paper, you need to compute the L2 norm for x-x+ and for x-x-, concatenate these two blobs, and feed the concatenated blob to a "Softmax" layer (see the sketch below).
    No need for messy hand-written gradient computations.
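
    A possible sketch of that last step, assuming the two norm blobs produced as above are named "dist_pos" and "dist_neg" (these names and the Reshape step are my additions, not part of the recipe above):

    # reshape the two B-sized norm blobs to B-by-1 so they can be concatenated along axis 1
    layer {
      name: "dist_pos_2d"
      type: "Reshape"
      bottom: "dist_pos"
      top: "dist_pos_2d"
      reshape_param { shape { dim: -1 dim: 1 } }
    }
    layer {
      name: "dist_neg_2d"
      type: "Reshape"
      bottom: "dist_neg"
      top: "dist_neg_2d"
      reshape_param { shape { dim: -1 dim: 1 } }
    }
    # concatenate into a B-by-2 blob of (||x-x+||, ||x-x-||)
    layer {
      name: "dists"
      type: "Concat"
      bottom: "dist_pos_2d"
      bottom: "dist_neg_2d"
      top: "dists"
      concat_param { axis: 1 }
    }
    # softmax over the two distances, as in the paper
    layer {
      name: "softmax_dists"
      type: "Softmax"
      bottom: "dists"
      top: "softmax_dists"
      softmax_param { axis: 1 }
    }

    With this in place, the backward pass through the norms is handled by each layer's built-in Backward, which is the point of building the loss from existing layers.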
