Question
I want to build an RNN using a Keras Sequential model with a TensorFlow backend. When I run the following code:
batch_size = 8
batch_inputshape = (batch_size, x_train.shape[1], x_train.shape[2])
print(batch_inputshape)  # (8, 600, 103)
model = Sequential()
model.add(LSTM(103,
               batch_input_shape=batch_inputshape,
               return_sequences=True,
               stateful=True))
model.add(Dropout(0.2))
model.add(LSTM(50,
               return_sequences=True,
               stateful=True))
model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(10)))
model.add(TimeDistributed(Dense(2)))
model.add(Activation('softmax'))
model.compile(loss=ncce, optimizer='adam')
print(model.output_shape)  # (8, 600, 2)
model.fit(x_train, y_train, batch_size=batch_size,
          nb_epoch=1, validation_split=0.25)
I get the following error message:
Input to reshape is a tensor with 16 values, but the requested shape has 8
But whatever value I set batch_size to, the error always follows the same pattern:
Input to reshape is a tensor with 2 * batch_size values, but the requested shape has batch_size
I have looked at other Q&As, but I do not think they help me much, or I don't understand the answers well enough.
Any help would be much appreciated!
EDIT: as requested, the shapes of the input and the target:
print(x_train.shape) #(512,600,103)
print(y_train.shape) #(512,600,2)
EDIT 2:
from functools import partial
from itertools import product
import keras.backend as K
import numpy as np

def w_categorical_crossentropy(y_true, y_pred, weights):
    # https://github.com/fchollet/keras/issues/2115#issuecomment-274101310
    nb_cl = len(weights)
    final_mask = K.zeros_like(y_pred[:, 0])
    y_pred_max = K.max(y_pred, axis=1)
    y_pred_max = K.reshape(y_pred_max, (K.shape(y_pred)[0], 1))
    y_pred_max_mat = K.cast(K.equal(y_pred, y_pred_max), K.floatx())
    for c_p, c_t in product(range(nb_cl), range(nb_cl)):
        final_mask += (weights[c_t, c_p] * y_pred_max_mat[:, c_p] * y_true[:, c_t])
    return K.categorical_crossentropy(y_pred, y_true) * final_mask

w_array = np.ones((2, 2))
w_array[1, 0] = 100
print(w_array)
ncce = partial(w_categorical_crossentropy, weights=w_array)
ncce.__name__ = 'w_categorical_crossentropy'
EDIT 3: UPDATE
With the help of @Nassim Ben, we found that the problem is in the loss function. He posted code with a regular loss function, and that works fine; with the custom loss function, the same code does not work. The custom loss function is the one I posted above, and that is where the problem lies. I do not yet know why this error occurs, but this is the current status.
Answer 1:
EDIT: This code works for me; I have only changed the loss for simplicity.
import keras
from keras.layers import *
from keras.models import Sequential
from keras.objectives import *
import numpy as np

x_train = np.random.random((512, 600, 103))
y_train = np.random.random((512, 600, 2))
batch_size = 8
batch_inputshape = (batch_size, x_train.shape[1], x_train.shape[2])
print(batch_inputshape)  # (8, 600, 103)
model = Sequential()
model.add(LSTM(103,
               batch_input_shape=batch_inputshape,
               return_sequences=True,
               stateful=True))
model.add(Dropout(0.2))
model.add(LSTM(50,
               return_sequences=True,
               stateful=True))
model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(10)))
model.add(TimeDistributed(Dense(2)))
model.add(Activation('softmax'))
model.compile(loss="mse", optimizer='adam')
print(model.output_shape)  # (8, 600, 2)
model.fit(x_train, y_train, batch_size=batch_size,
          nb_epoch=1, validation_split=0.25)
EDIT 2:
So the error was coming from the loss function. The ncce loss code you copied from GitHub was written for outputs of shape (batch, 10), whereas your outputs have shape (batch, 600, 2). So here is my edit of the function:
from itertools import product
import keras.backend as K

def w_categorical_crossentropy(y_true, y_pred, weights):
    # https://github.com/fchollet/keras/issues/2115#issuecomment-274101310
    nb_cl = len(weights)
    # Create a mask of zeros, one entry per (sample, timestep): shape = (batch, 600)
    final_mask = K.zeros_like(y_pred[:, :, 0])
    # Get the maximum probability value for every output (shape = (batch, 600, 1))
    y_pred_max = K.max(y_pred, axis=2, keepdims=True)
    # Mark the predicted class for every output (shape = (batch, 600, 2))
    # K.equal broadcasts here: (batch, 600, 2) is compared against (batch, 600, 1)
    y_pred_max_mat = K.equal(y_pred, y_pred_max)
    for c_p, c_t in product(range(nb_cl), range(nb_cl)):
        # Build the mask of weights to apply to the result of the cat_crossentropy
        final_mask += (weights[c_t, c_p] * K.cast(y_pred_max_mat[:, :, c_p], K.floatx()) * y_true[:, :, c_t])
    # Note: the (output, target) argument order follows the code this was adapted from;
    # check the signature in your installed Keras version, since it may expect (target, output)
    return K.categorical_crossentropy(y_pred, y_true) * final_mask

w_array = np.ones((2, 2))
w_array[1, 0] = 100
As you can see, I only changed the indexing to match the shape of your tensors. The mask has to be of shape (batch, 600), one weight per sample and timestep. The max has to be taken over the 3rd dimension, because that is where the class probabilities you want to output are. The multiplication that builds the mask had to be updated as well, again because of the shape of your tensors.
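The shape reasoning above can be checked without Keras. The following is only a sketch: it mirrors the mask-building part of the loss in plain NumPy, `weight_mask` is a hypothetical helper name that does not appear in the original code, and the toy `y_true` / `y_pred` values are made up for illustration.

```python
import numpy as np
from itertools import product


def weight_mask(y_true, y_pred, weights):
    """NumPy stand-in for the mask built inside the loss (not Keras code)."""
    nb_cl = len(weights)
    # One mask entry per (sample, timestep): shape (batch, timesteps)
    final_mask = np.zeros_like(y_pred[:, :, 0])
    # Highest class probability per timestep: shape (batch, timesteps, 1)
    y_pred_max = np.max(y_pred, axis=2, keepdims=True)
    # Broadcasts (batch, timesteps, classes) against (batch, timesteps, 1)
    y_pred_max_mat = (y_pred == y_pred_max).astype(y_pred.dtype)
    for c_p, c_t in product(range(nb_cl), range(nb_cl)):
        final_mask += weights[c_t, c_p] * y_pred_max_mat[:, :, c_p] * y_true[:, :, c_t]
    return final_mask


w_array = np.ones((2, 2))
w_array[1, 0] = 100  # true class 1 predicted as class 0 is weighted 100x

# Toy batch: 1 sample, 2 timesteps, 2 classes (made-up values)
y_true = np.array([[[0.0, 1.0], [1.0, 0.0]]])
y_pred = np.array([[[0.9, 0.1], [0.8, 0.2]]])

mask = weight_mask(y_true, y_pred, w_array)
print(mask.shape)
print(mask)
```

In this toy batch the first timestep has true class 1 but predicted class 0, so it picks up the weight 100, while the correctly predicted second timestep keeps the weight 1. The mask has one value per (sample, timestep), which is the (batch, 600) shape described above.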
This should work.
If you need a more detailed explanation, feel free to ask :-)
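For readers wondering where the original "2 * batch_size" message came from, the arithmetic can be traced by hand. This is a rough sketch in plain Python, assuming the failing reshape is the K.reshape call in the question's version of the loss; `reduce_shape` is a hypothetical helper written for this illustration and is not part of Keras.

```python
from math import prod


def reduce_shape(shape, axis, keepdims=False):
    """Shape left after reducing (e.g. taking the max) over one axis."""
    if keepdims:
        return tuple(1 if i == axis else d for i, d in enumerate(shape))
    return tuple(d for i, d in enumerate(shape) if i != axis)


batch_size = 8
y_pred_shape = (batch_size, 600, 2)

# Question's loss: K.max(y_pred, axis=1), then a reshape to (batch, 1)
max_shape = reduce_shape(y_pred_shape, axis=1)   # (8, 2)
values_in = prod(max_shape)                      # 2 * batch_size
values_requested = prod((batch_size, 1))         # batch_size
print(values_in, values_requested)

# Edited loss: K.max(y_pred, axis=2, keepdims=True), so no reshape is needed
fixed_shape = reduce_shape(y_pred_shape, axis=2, keepdims=True)
print(fixed_shape)
```

With batch_size = 8 this gives 16 values going into the reshape against 8 requested, which matches the reported message. Taking the max over the last axis with keepdims=True leaves a (8, 600, 1) tensor that broadcasts against y_pred directly.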
Source: https://stackoverflow.com/questions/42115585/input-to-reshape-is-a-tensor-with-2-batch-size-values-but-the-requested-sha