问题
As much as I know that Triplet Loss
is a Loss Function which decrease the distance between anchor and positive but decrease between anchor and negative. Also, there is a margin added to it.
So for EXAMPLE LEt us Suppose: a Siamese Network
, which gives embeddings:
anchor_output = [1,2,3,4,5...] # embedding given by the CNN model
positive_output = [1,2,3,4,4...]
negative_output= [53,43,33,23,13...]
And I think I can get the triplet loss such as: (I think I have to make it as loss using Lambda Layer or so)
# calculate triplet loss
d_pos = tf.reduce_sum(tf.square(anchor_output - positive_output), 1)
d_neg = tf.reduce_sum(tf.square(anchor_output - negative_output), 1)
loss = tf.maximum(0., margin + d_pos - d_neg)
loss = tf.reduce_mean(loss)
So what on the earth is: tfa.losses.TripletHardLoss and tfa.losses.TripletSemiHardLoss
As much as I know, Semi and hard are type of data generation techniques for Siamese Techniques
which push the model to learn more.
MY Thinking: As I have learned it in This Post, I think you can do:
- Generate a Batch of say 3 images and make a pair of 3 having
27
images - Discard every invalid pair (all i,j,k should be unique). Remaining Batch
B
- Get the embeddings on each pair in batch
B
So I think HardTripletLoss
takes account of only those 3 images per batch which had Biggest Anchor-Positive distance and Lowest Anchor- Negative distance.
And for Semi Hard
, I think it discards all the losses calculated by every image pair where the distance was 0.
if not, Could someone please correct me and tell me how these can be used. (I know we can use it inside model.complie()
but my question is different.
回答1:
What is TripletHardLoss
?
This loss follow the ordinary TripletLoss
form, but using the maximum positive distance and minimum negative distance plus the margin constant within the batch when computing the loss, as we can see in the formula:
Look into source code of tfa.losses.TripletHardLoss
we can see above formula been implement exactly:
# Build pairwise binary adjacency matrix.
adjacency = tf.math.equal(labels, tf.transpose(labels))
# Invert so we can select negatives only.
adjacency_not = tf.math.logical_not(adjacency)
adjacency_not = tf.cast(adjacency_not, dtype=tf.dtypes.float32)
# hard negatives: smallest D_an.
hard_negatives = _masked_minimum(pdist_matrix, adjacency_not)
batch_size = tf.size(labels)
adjacency = tf.cast(adjacency, dtype=tf.dtypes.float32)
mask_positives = tf.cast(adjacency, dtype=tf.dtypes.float32) - tf.linalg.diag(
tf.ones([batch_size])
)
# hard positives: largest D_ap.
hard_positives = _masked_maximum(pdist_matrix, mask_positives)
if soft:
triplet_loss = tf.math.log1p(tf.math.exp(hard_positives - hard_negatives))
else:
triplet_loss = tf.maximum(hard_positives - hard_negatives + margin, 0.0)
# Get final mean triplet loss
triplet_loss = tf.reduce_mean(triplet_loss)
Note the soft
parameter in tfa.losses.TripletHardLoss
are not using following formula to calculate the ordinary TripletLoss
:
Because as we can see in above source code, it still using maximum positive distance and minimum negative distance, it determine using the soft margin or not
What is TripletSemiHardLoss
?
This loss also follow the ordinary TripletLoss
form, positive distances is same as in ordinary TripletLoss
and negative distance using semi-hard negative:
Minimum negative distance among which are at least greater than the positive distance plus the margin constant, if no such negative exists, uses the largest negative distance instead.
i.e we want first find negative distance that satisfies following condition:
p
for positive and n
for negative, if wan can't find the negative distance that satisfies this condition then we using largest negative distance instead.
As we can see above condition process clear in source code of tfa.losses.TripletSemiHardLoss
, where negatives_outside
is distance that satisfies this condition and negatives_inside
is largest negative distance:
# Build pairwise binary adjacency matrix.
adjacency = tf.math.equal(labels, tf.transpose(labels))
# Invert so we can select negatives only.
adjacency_not = tf.math.logical_not(adjacency)
batch_size = tf.size(labels)
# Compute the mask.
pdist_matrix_tile = tf.tile(pdist_matrix, [batch_size, 1])
mask = tf.math.logical_and(
tf.tile(adjacency_not, [batch_size, 1]),
tf.math.greater(
pdist_matrix_tile, tf.reshape(tf.transpose(pdist_matrix), [-1, 1])
),
)
mask_final = tf.reshape(
tf.math.greater(
tf.math.reduce_sum(
tf.cast(mask, dtype=tf.dtypes.float32), 1, keepdims=True
),
0.0,
),
[batch_size, batch_size],
)
mask_final = tf.transpose(mask_final)
adjacency_not = tf.cast(adjacency_not, dtype=tf.dtypes.float32)
mask = tf.cast(mask, dtype=tf.dtypes.float32)
# negatives_outside: smallest D_an where D_an > D_ap.
negatives_outside = tf.reshape(
_masked_minimum(pdist_matrix_tile, mask), [batch_size, batch_size]
)
negatives_outside = tf.transpose(negatives_outside)
# negatives_inside: largest D_an.
negatives_inside = tf.tile(
_masked_maximum(pdist_matrix, adjacency_not), [1, batch_size]
)
semi_hard_negatives = tf.where(mask_final, negatives_outside, negatives_inside)
loss_mat = tf.math.add(margin, pdist_matrix - semi_hard_negatives)
mask_positives = tf.cast(adjacency, dtype=tf.dtypes.float32) - tf.linalg.diag(
tf.ones([batch_size])
)
# In lifted-struct, the authors multiply 0.5 for upper triangular
# in semihard, they take all positive pairs except the diagonal.
num_positives = tf.math.reduce_sum(mask_positives)
triplet_loss = tf.math.truediv(
tf.math.reduce_sum(
tf.math.maximum(tf.math.multiply(loss_mat, mask_positives), 0.0)
),
num_positives,
)
How to use those loss?
Both loss expect y_true
to be provided as 1-D integer Tensor
with shape [batch_size] of multi-class integer labels. And embeddings y_pred
must be 2-D float Tensor
of l2 normalized embedding vectors.
Example code to prepare the inputs and labels:
import tensorflow as tf
import tensorflow_addons as tfa
import tensorflow_datasets as tfds
def _normalize_img(img, label):
img = tf.cast(img, tf.float32) / 255.
return (img, label)
train_dataset, test_dataset = tfds.load(name="mnist", split=['train', 'test'], as_supervised=True)
# Build your input pipelines
train_dataset = train_dataset.shuffle(1024).batch(16)
train_dataset = train_dataset.map(_normalize_img)
# Take one batch of data
for data in train_dataset.take(1):
print("Batch of images shape:\n{}\nBatch of labels:\n{}\n".format(data[0].shape, data[1]))
Outputs:
Batch of images shape:
(16, 28, 28, 1)
Batch of labels:
[8 4 0 3 2 4 5 1 0 5 7 0 2 6 4 9]
Following this official tutorial about how to using TripletSemiHardLoss (TripletHardLoss as well) in general if you have problem when using it.
来源:https://stackoverflow.com/questions/65579247/how-does-the-tensorflows-tripletsemihardloss-and-triplethardloss-and-how-to-use