NaN loss when training regression network

渐次进展 2020-11-29 16:28

I have a data matrix in "one-hot encoding" (all ones and zeros) with 260,000 rows and 35 columns. I am using Keras to train a simple neural network to predict a continuous variable.

17 answers
  • 2020-11-29 16:40

    I tried every suggestion on this page and many others, to no avail. We were importing CSV files with pandas, then using the Keras Tokenizer on the text input to build vocabularies and word-vector matrices. After noticing that some CSV files led to NaN while others worked, we finally looked at the encoding of the files and realized that ASCII files were NOT working with Keras, producing a NaN loss and an accuracy of 0.0000e+00, whereas UTF-8 and UTF-16 files were working! Breakthrough.

    If you're performing textual analysis and getting a NaN loss after trying these suggestions, use file -i {input} (Linux) or file -I {input} (OS X) to discover your file's encoding. If you have ISO-8859-1 or us-ascii, try converting to UTF-8 or UTF-16LE. I haven't tried the latter, but I'd imagine it would work as well. Hopefully this helps someone who is very, very frustrated!
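
    A hedged sketch of one way to re-encode such a file to UTF-8 in Python before loading it with pandas (the file names and the assumed source encoding are illustrative, not from the original answer):

    # Minimal sketch: rewrite a CSV as UTF-8 (paths and source encoding are assumptions).
    with open('input_ascii.csv', 'r', encoding='iso-8859-1') as src:
        text = src.read()
    with open('input_utf8.csv', 'w', encoding='utf-8') as dst:
        dst.write(text)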

  • 2020-11-29 16:40

    In my case the issue was that I had copy-pasted my previous work on binary classification and kept the sigmoid activation on the output layer instead of softmax (the new network was for multi-class classification).
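
    A minimal sketch of what that output-layer fix looks like in Keras (the layer sizes, class count, and input shape are illustrative assumptions, not the answerer's actual model):

    # Minimal sketch: multi-class output layer (illustrative sizes and class count).
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense

    num_classes = 10  # hypothetical number of classes

    model = Sequential([
        Dense(64, activation='relu', input_shape=(35,)),
        # For multi-class classification use softmax + categorical_crossentropy;
        # a sigmoid output copied from a binary model can send the loss to NaN.
        Dense(num_classes, activation='softmax'),
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])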

  • 2020-11-29 16:41

    I had a similar issue, with my log loss, MAE, and other metrics all coming out as NaN. I looked into the data and found that a few features contained NaNs. I imputed the NaNs with approximate values and was able to solve the issue.
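
    A minimal sketch of this kind of NaN check and imputation with pandas (the file name and the mean-imputation strategy are illustrative assumptions):

    # Minimal sketch: find and impute NaNs before training (illustrative path and strategy).
    import pandas as pd

    df = pd.read_csv('data.csv')                 # hypothetical input file
    print(df.isnull().any())                     # which columns contain NaNs
    df = df.fillna(df.mean(numeric_only=True))   # impute with approximate (column-mean) values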

  • 2020-11-29 16:42

    I faced the same problem using an LSTM. The problem was that my data contained some NaN values after standardization, so we should check the model's input data after standardization; if you do, you will see the NaN values:

    print(np.any(np.isnan(X_test)))
    print(np.any(np.isnan(y_test)))
    

    You can solve this by adding a small value (0.000001) to the standard deviation, like this:

    import numpy as np

    def standardize(train, test):
        mean = np.mean(train, axis=0)
        # Add a tiny epsilon so constant columns (std == 0) don't produce NaN/inf.
        std = np.std(train, axis=0) + 0.000001

        X_train = (train - mean) / std
        X_test = (test - mean) / std
        return X_train, X_test
    
  • 2020-11-29 16:42

    I was getting the same thing when I tried creating a bounding-box regressor. My neural net had larger layers than yours. I increased the dropout rate and got suitable results.
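
    A hedged sketch of what raising the dropout rate might look like in a Keras bounding-box regressor (the architecture and rates are illustrative assumptions, not the answerer's actual network):

    # Minimal sketch: a regression head with a higher dropout rate (illustrative architecture).
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, Dropout

    model = Sequential([
        Dense(256, activation='relu', input_shape=(35,)),
        Dropout(0.5),  # raised from a smaller value such as 0.2
        Dense(128, activation='relu'),
        Dropout(0.5),
        Dense(4),      # e.g. four bounding-box coordinates, linear output
    ])
    model.compile(optimizer='adam', loss='mse')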

  • 2020-11-29 16:46

    To sum up the different solutions mentioned here and in this GitHub discussion (which one applies will of course depend on your particular situation; see the sketch after this list):

    • Add regularization to apply l1 or l2 penalties to the weights. Otherwise, try a smaller l2 penalty, e.g. l2(0.001), or remove it if it already exists.
    • Try a smaller dropout rate.
    • Clip the gradients to prevent them from exploding. For instance, in Keras you could pass clipnorm=1. or clipvalue=1. as parameters to your optimizer.
    • Check the validity of your inputs (no NaNs or, sometimes, 0s), e.g. with df.isnull().any().
    • Replace the optimizer with Adam, which is easier to handle. Sometimes replacing sgd with rmsprop also helps.
    • Use RMSProp with heavy regularization to prevent gradient explosion.
    • Try normalizing your data, or inspect your normalization process for any bad values being introduced.
    • Verify that you are using the right activation function (e.g. softmax instead of sigmoid for multi-class classification).
    • Try increasing the batch size (e.g. from 32 to 64 or 128) to increase the stability of the optimization.
    • Try decreasing your learning rate.
    • Check the size of your last batch, which may differ from the batch size.
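
    Several of these fixes live in the layer and optimizer definitions. The following is a minimal, hedged sketch of how a few of them (an input NaN check, a small l2 penalty, gradient clipping, and Adam with a reduced learning rate) might be combined in Keras; the layer sizes, values, and the X_train/y_train arrays are illustrative assumptions, not part of the original answers.

    # Minimal sketch combining several of the fixes listed above (illustrative values).
    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, Dropout
    from tensorflow.keras.optimizers import Adam
    from tensorflow.keras.regularizers import l2

    # 1) Sanity-check the inputs before training (X_train/y_train assumed to exist).
    assert not np.any(np.isnan(X_train)), "NaNs in X_train"
    assert not np.any(np.isnan(y_train)), "NaNs in y_train"

    # 2) Small l2 penalty and a modest dropout rate.
    model = Sequential([
        Dense(64, activation='relu', kernel_regularizer=l2(0.001), input_shape=(35,)),
        Dropout(0.2),
        Dense(1),  # linear output for regression
    ])

    # 3) Adam with a reduced learning rate and gradient clipping (clipnorm).
    model.compile(optimizer=Adam(learning_rate=1e-4, clipnorm=1.0), loss='mse')

    # 4) A larger batch size can further stabilize the optimization.
    model.fit(X_train, y_train, batch_size=64, epochs=10)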