Question
I am trying to predict a continuous value (using a neural network for the first time). I have normalized the input data, but I can't figure out why I am getting a loss: nan output starting with the first epoch.
I read and tried many suggestions from previous answers to the same question, but none of them helped. My training data shape is (201917, 64).
Here's my code:
from keras.models import Sequential
from keras.layers import Dense

# Three hidden layers of 100 ReLU units each
model = Sequential()
model.add(Dense(100, input_dim=X.shape[1], activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(100, activation='relu'))
# Output layer
model.add(Dense(1, activation='linear'))
# Construct the neural network inside of TensorFlow
model.compile(loss='mean_squared_error', optimizer='Adam')
# train the model
model.fit(X_train, y_train, epochs=10, batch_size=32,
          shuffle=True, verbose=2)
Answer 1:
In short, these are the steps that you can take to find the cause of your problem:
Make sure that your dataset is what it should be:
- Look for any nan/inf in your dataset and fix it.
- Incorrect encoding (convert it to UTF-8).
- Invalid values in your column or rows.
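The dataset checks above can be sketched with NumPy as follows. This is only a minimal illustration: report_bad_values is a hypothetical helper name, and X_demo and y_demo are made-up stand-ins for the real X_train and y_train.

```python
import numpy as np

def report_bad_values(name, arr):
    """Count NaN and +/-inf entries in a numeric array and print a summary."""
    arr = np.asarray(arr, dtype=float)
    n_nan = int(np.isnan(arr).sum())
    n_inf = int(np.isinf(arr).sum())
    print(f"{name}: {n_nan} NaN, {n_inf} inf, shape {arr.shape}")
    return n_nan, n_inf

# Hypothetical stand-in data; replace with the real training arrays.
X_demo = np.array([[1.0, 2.0], [np.nan, 4.0], [5.0, np.inf]])
y_demo = np.array([0.5, 1.5, 2.5])

bad_X = report_bad_values("X_demo", X_demo)
bad_y = report_bad_values("y_demo", y_demo)
```

It is worth running the same check on the targets as well as the features, since a single NaN in either one is enough to make the mean squared error NaN.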
Regularize your model (Dropout, BatchNormalization, L1/L2 regularization), change your batch_size, or scale your data to another range (e.g. [-1, 1]).
- Reduce the size of your network.
- Change other hyper-parameters (e.g. optimizer or activation function).
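As a rough sketch of the rescaling suggestion, the snippet below maps each column to [-1, 1] with plain NumPy. The function name scale_to_range and the demo array are assumptions for illustration. Note the guard for constant columns: a naive min-max scaling divides 0 by 0 there and silently introduces NaN into the inputs.

```python
import numpy as np

def scale_to_range(X, low=-1.0, high=1.0):
    """Min-max scale each column of X to [low, high]."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    span = col_max - col_min
    # A constant column has span 0; dividing by it would give NaN (0/0).
    span[span == 0] = 1.0
    return low + (X - col_min) / span * (high - low)

# Hypothetical data: the third column is constant on purpose.
X_demo = np.array([[0.0, 10.0, 7.0],
                   [5.0, 20.0, 7.0],
                   [10.0, 30.0, 7.0]])
X_scaled = scale_to_range(X_demo)
print(X_scaled)
```

In practice the column minima and maxima should be computed on the training split only and then reused for validation and test data.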
Answer 2:
Sometimes the loss becomes NaN when the learning rate is too high. One solution is to lower it.
Replace this code:
# Construct the neural network inside of TensorFlow
model.compile(loss='mean_squared_error', optimizer='Adam')
with:
from keras.optimizers import Adam  # put this at the top of your file
# The default is 0.001, so try a smaller value (older Keras versions spell this argument lr)
opt = Adam(learning_rate=0.0001)
model.compile(optimizer=opt, loss='mean_squared_error')
See if that helps. I would also try with a single hidden layer first and see how it goes.
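To see why a learning rate that is too high can end in NaN, here is a toy illustration in plain Python. It is not the Keras model from the question: it runs ordinary gradient descent on f(w) = w**2, and the function name and step count are arbitrary choices for the demo.

```python
import math

def gradient_descent(lr, steps=2000, w0=1.0):
    """Minimise f(w) = w**2 with plain gradient descent; return the final loss."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w       # derivative of w**2
        w = w - lr * grad    # gradient step
    return w * w

small = gradient_descent(lr=0.1)   # each step shrinks w by a factor of 0.8
large = gradient_descent(lr=1.5)   # each step multiplies w by -2

print("lr=0.1 final loss:", small)
print("lr=1.5 final loss:", large)
print("lr=1.5 loss is NaN:", math.isnan(large))
```

With the large step the weight doubles in magnitude every iteration, overflows to infinity, and the next update computes inf - inf, which is NaN. From then on every loss value is NaN, which matches the symptom in the question.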
Answer 3:
NaN in the input data frame
Replace any NaN values before extracting the values from the data frame. Otherwise the NaN propagates through the forward pass into the loss and the gradients.
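A minimal sketch of that cleanup with pandas, assuming the data sits in a DataFrame. The column names and values below are invented for the example, and filling with the column median is just one possible choice (dropping the rows with dropna() is another).

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one NaN and one inf.
df = pd.DataFrame({
    "feature_a": [1.0, np.nan, 3.0],
    "feature_b": [4.0, 5.0, np.inf],
    "target": [0.1, 0.2, 0.3],
})

# Treat +/-inf as missing too, then count what needs fixing.
df = df.replace([np.inf, -np.inf], np.nan)
n_missing = int(df.isna().sum().sum())
print("missing values:", n_missing)

# Fill each column's gaps with that column's median, then extract arrays.
filled = df.fillna(df.median())
X_demo = filled[["feature_a", "feature_b"]].values
y_demo = filled["target"].values
```

Only after this step are the arrays passed on to model.fit.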
Source: https://stackoverflow.com/questions/53640858/loss-nan-in-keras-while-performing-regression