Question
I want to use Euclidean distance as the loss function for an LSTM or RNN.
What output should such a function have: a float, (batch_size), or (batch_size, timesteps)?
The model input X_train has shape (n_samples, timesteps, data_dim), and Y_train has the same shape.
Example code:
from keras import backend as K
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense

def euc_dist_keras(x, y):
    return K.sqrt(K.sum(K.square(x - y), axis=-1, keepdims=True))

model = Sequential()
model.add(SimpleRNN(n_units, activation='relu', input_shape=(timesteps, data_dim), return_sequences=True))
model.add(Dense(n_output, activation='linear'))
model.compile(loss=euc_dist_keras, optimizer='adagrad')
model.fit(X_train, Y_train, batch_size=512, epochs=10)
So, should I average the loss over the timesteps dimension and/or over the batch_size?
Answer 1:
A loss function takes the predicted and true labels and outputs a scalar. In Keras:
from keras import backend as K

def euc_dist_keras(y_true, y_pred):
    return K.sqrt(K.sum(K.square(y_true - y_pred), axis=-1, keepdims=True))
Note that it does not take X_train as an input. The loss is computed after the forward-propagation step, and its value measures how well the predicted labels match the true labels.
What output should such a function have: a float, (batch_size), or (batch_size, timesteps)?
The loss function should reduce to a scalar. In practice, a Keras loss may return one value per sample (as euc_dist_keras above does); Keras then averages those values into the single scalar loss it optimizes.
So, should I average the loss over the timesteps dimension and/or over the batch_size?
No, this is not required in order to use Euclidean distance as a loss function; Keras performs the averaging for you, as the sketch below illustrates.
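For example, here is a minimal sketch (using the euc_dist_keras defined above and made-up dimensions batch_size=4, timesteps=10, data_dim=8) showing that the function may return a tensor rather than a scalar; Keras averages those values into the reported scalar loss:

import numpy as np
from keras import backend as K

# Dummy targets and predictions of shape (batch_size, timesteps, data_dim)
y_true = K.variable(np.random.rand(4, 10, 8))
y_pred = K.variable(np.random.rand(4, 10, 8))

per_step = euc_dist_keras(y_true, y_pred)
print(K.int_shape(per_step))  # (4, 10, 1): one distance per time step
# Keras reduces these values to one scalar by averaging,
# so no manual averaging over timesteps or batch_size is needed.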
Side note: In your case, I think the problem might be with the neural network architecture, not the loss. Given input of shape (batch_size, timesteps, data_dim), the output of the SimpleRNN will be (batch_size, timesteps, n_units), and the output of the Dense layer will be (batch_size, n_output). Thus, given that your Y_train has the shape (batch_size, timesteps, data_dim), you would likely need to use the TimeDistributed wrapper to apply Dense to every temporal slice, and to adjust the number of hidden units in the fully connected layer.
Source: https://stackoverflow.com/questions/46594115/euclidean-distance-loss-function-for-rnn-keras