Implementing a custom loss function in Keras with different sizes for y_true and y_pred

执念已碎 · 2020-12-29 00:19

I am new to Keras. I need some help writing a custom loss function in Keras with the TensorFlow backend for the following loss equation.
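(The equation image has not survived; reconstructed from the numpy reference implementation in the answers below, the loss appears to be a Gaussian-heatmap L2 loss:)

    $$ L \;=\; \frac{1}{J}\sum_{j=1}^{B}\sum_{i=1}^{J}\sum_{z}\left(\exp\!\left(-\frac{\lVert z - t_{j,i}\rVert^{2}}{2\sigma^{2}}\right)-\hat{y}_{j}(z, i)\right)^{2} $$

where $B$ is the batch size, $J$ is the number of joints, $z$ ranges over the pixel coordinates of an $H \times W$ grid, $t_{j,i}$ is the ground-truth 2-D location of joint $i$ in sample $j$, and $\sigma$ is a fixed spread.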

3 Answers
  • 2020-12-29 00:52

    Recent versions of Keras actually support losses where y_pred and y_true have different shapes. The built-in loss `sparse_categorical_crossentropy` is an example of this. The TensorFlow backend implementation of this loss is here: https://github.com/keras-team/keras/blob/0fc33feb5f4efe3bb823c57a8390f52932a966ab/keras/backend/tensorflow_backend.py#L3570

    Notice how its docstring says "target: An integer tensor." rather than "target: A tensor of the same shape as `output`." like the other losses. I tried this with a custom loss I wrote myself, and it seems to work fine.

    I'm using Keras 2.2.4.
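
    For illustration, here is a minimal sketch of such a loss (a hypothetical example, assuming Keras 2.2.x with the TensorFlow backend), where y_true holds integer class indices while y_pred is a distribution over classes:

    from keras import backend as K

    def sparse_l2_loss(y_true, y_pred):
        # y_true: (batch,) or (batch, 1) integer class indices
        # y_pred: (batch, num_classes) predicted probabilities
        indices = K.cast(K.flatten(y_true), 'int32')
        onehot = K.one_hot(indices, K.int_shape(y_pred)[-1])  # now shaped like y_pred
        return K.mean(K.square(onehot - y_pred), axis=-1)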

  • 2020-12-29 00:58

    You can pretty much just translate the numpy functions into Keras backend functions. The only thing to be careful about is setting up the right broadcast shape.

    from keras import backend as K  # K.tf is the TensorFlow module under the TF backend

    # im_height, im_width and sigma are module-level constants (defined below)
    def l2_loss_keras(y_true, y_pred):
        # set up a pixel-coordinate meshgrid: (height, width, 2)
        meshgrid = K.tf.meshgrid(K.arange(im_height), K.arange(im_width))
        meshgrid = K.cast(K.transpose(K.stack(meshgrid)), K.floatx())

        # set up the broadcast shape: (batch_size, height, width, num_joints, 2)
        meshgrid_broadcast = K.expand_dims(K.expand_dims(meshgrid, 0), -2)
        y_true_broadcast = K.expand_dims(K.expand_dims(y_true, 1), 2)
        diff = meshgrid_broadcast - y_true_broadcast

        # compute the loss: Gaussian ground-truth heatmaps, squared error summed
        # over (height, width), then averaged over num_joints
        ground = K.exp(-0.5 * K.sum(K.square(diff), axis=-1) / sigma ** 2)
        loss = K.sum(K.square(ground - y_pred), axis=[1, 2])
        return K.mean(loss, axis=-1)
    

    To verify it:

    import numpy as np

    def l2_loss_numpy(y_true, y_pred):
        loss = 0
        n = y_true.shape[0]
        for j in range(n):
            for i in range(num_joints):
                yv, xv = np.meshgrid(np.arange(0, im_height), np.arange(0, im_width))
                z = np.stack([xv, yv]).transpose(1, 2, 0)
                ground = np.exp(-0.5 * (((z - y_true[j, i, :]) ** 2).sum(axis=2)) / (sigma ** 2))
                loss = loss + np.sum((ground - y_pred[j, :, :, i]) ** 2)
        return loss / num_joints
    
    batch_size = 32
    num_joints = 10
    sigma = 5
    im_width = 256
    im_height = 256
    
    y_true = 255 * np.random.rand(batch_size, num_joints, 2)
    y_pred = 255 * np.random.rand(batch_size, im_height, im_width, num_joints)
    
    print(l2_loss_numpy(y_true, y_pred))
    45448272129.0
    
    print(K.eval(l2_loss_keras(K.variable(y_true), K.variable(y_pred))).sum())
    4.5448e+10
    

    The Keras number is truncated because the backend's default dtype is float32. To run the comparison in double precision, switch the backend float type before creating the variables:
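
    # make the Keras backend create float64 tensors by default
    K.set_floatx('float64')

    Regenerating the random inputs and rerunning the check then gives matching results: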

    y_true = 255 * np.random.rand(batch_size, num_joints, 2)
    y_pred = 255 * np.random.rand(batch_size, im_height, im_width, num_joints)
    
    print(l2_loss_numpy(y_true, y_pred))
    45460126940.6
    
    print(K.eval(l2_loss_keras(K.variable(y_true), K.variable(y_pred))).sum())
    45460126940.6
    

    EDIT:

    It seems that Keras requires y_true and y_pred to have the same number of dimensions. For example, in the following test model the output is 4-D, (batch, 256, 256, 10), while the target is 3-D:

    from keras.models import Sequential
    from keras.layers import Dense

    X = np.random.rand(batch_size, 256, 256, 3)
    model = Sequential([Dense(10, input_shape=(256, 256, 3))])
    model.compile(loss=l2_loss_keras, optimizer='adam')
    model.fit(X, y_true, batch_size=8)
    
    ValueError: Cannot feed value of shape (8, 10, 2) for Tensor 'dense_2_target:0', which has shape '(?, ?, ?, ?)'
    

    To deal with this problem, you can add a dummy dimension to y_true with expand_dims before feeding it into the model, and drop one of the expand_dims calls inside the loss accordingly:

    def l2_loss_keras(y_true, y_pred):
        ...
    
        y_true_broadcast = K.expand_dims(y_true, 1)  # change this line
    
        ...
    
    model.fit(X, np.expand_dims(y_true, axis=1), batch_size=8)
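
    As a quick sanity check (a sketch reusing the tensors from the verification above), the patched loss should still compute the same value once the dummy axis is added:

    y_true_4d = np.expand_dims(y_true, axis=1)  # (batch_size, 1, num_joints, 2)
    print(K.eval(l2_loss_keras(K.variable(y_true_4d), K.variable(y_pred))).sum())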
    
  • 2020-12-29 01:07

    Yu's answer is correct; still, I want to share my experience. Whenever you write a custom loss function, beware of a few things:

    1. At compile time, Keras will not complain about a size mismatch. For example, if y_pred, which comes from the output layer of your model, has a 3-D shape, y_true will default to a 3-D shape as well. BUT at run time, i.e. during fit, if you pass the target data as 2-D (which many people do), you may get an error from inside your loss function; e.g., sigmoid_crossentropy_with_logits will complain. Hence, do pass targets as 3-D via np.expand_dims, as sketched after this list.
    2. Also, in a custom loss, make sure you use y_true and y_pred as the arguments (if somebody knows whether other names can be used as arguments, please shout).
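
    A minimal sketch of both points (hypothetical shapes, not taken from the question above):

    import numpy as np
    from keras import backend as K

    def my_loss(y_true, y_pred):  # Keras passes (targets, predictions) to the loss
        return K.mean(K.square(y_true - y_pred), axis=-1)

    # if the model's output is 3-D, e.g. (batch, 1, 10), expand 2-D targets to match
    targets = np.random.rand(32, 10)
    targets_3d = np.expand_dims(targets, axis=1)  # (32, 1, 10)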