NaN loss when training regression network

渐次进展 2020-11-29 16:28

I have a data matrix in "one-hot encoding" (all ones and zeros) with 260,000 rows and 35 columns. I am using Keras to train a simple neural network to predict a continuous variable.

17 answers
  • 2020-11-29 16:40

    I tried every suggestion on this page and many others, to no avail. We were importing CSV files with pandas, then using the Keras Tokenizer on the text input to build vocabularies and word-vector matrices. After noticing that some CSV files led to NaN while others worked, we finally looked at the encoding of the files and realized that ASCII files were NOT working with Keras, producing a NaN loss and an accuracy of 0.0000e+00, whereas UTF-8 and UTF-16 files were working! Breakthrough.

    If you're performing textual analysis and getting a NaN loss after trying these suggestions, use file -i {input} (Linux) or file -I {input} (OS X) to discover your file's encoding. If you have ISO-8859-1 or us-ascii, try converting to UTF-8 or UTF-16LE. I haven't tried the latter, but I'd imagine it would work as well. Hopefully this helps someone who is very, very frustrated!
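
    A hedged sketch of one way to re-encode such a file to UTF-8 in Python before loading it with pandas (the file names and the assumed source encoding are illustrative, not from the original answer):

    # Minimal sketch: rewrite a CSV as UTF-8 (paths and source encoding are assumptions).
    with open('input_ascii.csv', 'r', encoding='iso-8859-1') as src:
        text = src.read()
    with open('input_utf8.csv', 'w', encoding='utf-8') as dst:
        dst.write(text)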

  • 2020-11-29 16:40

    In my case the issue was that I had copy-pasted my previous work on binary classification and kept the sigmoid activation on the output layer instead of softmax (the new network was for multi-class classification).
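
    A minimal sketch of what that output-layer fix looks like in Keras (the layer sizes, class count, and input shape are illustrative assumptions, not the answerer's actual model):

    # Minimal sketch: multi-class output layer (illustrative sizes and class count).
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense

    num_classes = 10  # hypothetical number of classes

    model = Sequential([
        Dense(64, activation='relu', input_shape=(35,)),
        # For multi-class classification use softmax + categorical_crossentropy;
        # a sigmoid output copied from a binary model can send the loss to NaN.
        Dense(num_classes, activation='softmax'),
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])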

  • 2020-11-29 16:41

    I had a similar issue, with my log loss, MAE, and other metrics all coming out as NaN. I looked into the data and found that a few features contained NaNs. I imputed the NaNs with approximate values and was able to solve the issue.
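
    A minimal sketch of this kind of NaN check and imputation with pandas (the file name and the mean-imputation strategy are illustrative assumptions):

    # Minimal sketch: find and impute NaNs before training (illustrative path and strategy).
    import pandas as pd

    df = pd.read_csv('data.csv')                 # hypothetical input file
    print(df.isnull().any())                     # which columns contain NaNs
    df = df.fillna(df.mean(numeric_only=True))   # impute with approximate (column-mean) values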

  • 2020-11-29 16:42

    I faced the same problem using an LSTM. The problem was that my data contained some NaN values after standardization, so we should check the model's input data after standardization; if you do, you will see the NaN values:

    print(np.any(np.isnan(X_test)))
    print(np.any(np.isnan(y_test)))
    

    You can solve this by adding a small value (0.000001) to the standard deviation, like this:

    import numpy as np

    def standardize(train, test):
        mean = np.mean(train, axis=0)
        # Add a tiny epsilon so constant columns (std == 0) don't produce NaN/inf.
        std = np.std(train, axis=0) + 0.000001

        X_train = (train - mean) / std
        X_test = (test - mean) / std
        return X_train, X_test
    
  • 2020-11-29 16:42

    I was getting the same thing when I tried creating a bounding-box regressor. My neural net had larger layers than yours. I increased the dropout rate and got suitable results.
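
    A hedged sketch of what raising the dropout rate might look like in a Keras bounding-box regressor (the architecture and rates are illustrative assumptions, not the answerer's actual network):

    # Minimal sketch: a regression head with a higher dropout rate (illustrative architecture).
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, Dropout

    model = Sequential([
        Dense(256, activation='relu', input_shape=(35,)),
        Dropout(0.5),  # raised from a smaller value such as 0.2
        Dense(128, activation='relu'),
        Dropout(0.5),
        Dense(4),      # e.g. four bounding-box coordinates, linear output
    ])
    model.compile(optimizer='adam', loss='mse')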

  • 2020-11-29 16:46

    To sum up the different solutions mentioned here and in this GitHub discussion (which one applies will of course depend on your particular situation; see the sketch after this list):

    • Add regularization to apply l1 or l2 penalties to the weights. Otherwise, try a smaller l2 penalty, e.g. l2(0.001), or remove it if it already exists.
    • Try a smaller dropout rate.
    • Clip the gradients to prevent them from exploding. For instance, in Keras you could pass clipnorm=1. or clipvalue=1. as parameters to your optimizer.
    • Check the validity of your inputs (no NaNs or, sometimes, 0s), e.g. with df.isnull().any().
    • Replace the optimizer with Adam, which is easier to handle. Sometimes replacing sgd with rmsprop also helps.
    • Use RMSProp with heavy regularization to prevent gradient explosion.
    • Try normalizing your data, or inspect your normalization process for any bad values being introduced.
    • Verify that you are using the right activation function (e.g. softmax instead of sigmoid for multi-class classification).
    • Try increasing the batch size (e.g. from 32 to 64 or 128) to increase the stability of the optimization.
    • Try decreasing your learning rate.
    • Check the size of your last batch, which may differ from the batch size.
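
    Several of these fixes live in the layer and optimizer definitions. The following is a minimal, hedged sketch of how a few of them (an input NaN check, a small l2 penalty, gradient clipping, and Adam with a reduced learning rate) might be combined in Keras; the layer sizes, values, and the X_train/y_train arrays are illustrative assumptions, not part of the original answers.

    # Minimal sketch combining several of the fixes listed above (illustrative values).
    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, Dropout
    from tensorflow.keras.optimizers import Adam
    from tensorflow.keras.regularizers import l2

    # 1) Sanity-check the inputs before training (X_train/y_train assumed to exist).
    assert not np.any(np.isnan(X_train)), "NaNs in X_train"
    assert not np.any(np.isnan(y_train)), "NaNs in y_train"

    # 2) Small l2 penalty and a modest dropout rate.
    model = Sequential([
        Dense(64, activation='relu', kernel_regularizer=l2(0.001), input_shape=(35,)),
        Dropout(0.2),
        Dense(1),  # linear output for regression
    ])

    # 3) Adam with a reduced learning rate and gradient clipping (clipnorm).
    model.compile(optimizer=Adam(learning_rate=1e-4, clipnorm=1.0), loss='mse')

    # 4) A larger batch size can further stabilize the optimization.
    model.fit(X_train, y_train, batch_size=64, epochs=10)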