scipy.optimize.curve_fit() - array must not contain infs or NaNs

孤城傲影 2021-02-05 15:34

I am trying to fit some data to a curve in Python using scipy.optimize.curve_fit. I am running into the error ValueError: array must not contain infs or NaNs.
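
A first thing to rule out, sketched under the assumption that the x and y data are 1-D numeric arrays of equal length: by default curve_fit checks its inputs for finiteness (check_finite=True), so a single NaN or inf in the data is enough to raise this ValueError. The helper name drop_nonfinite and the sample values below are made up for illustration.

```python
import numpy as np

def drop_nonfinite(xdata, ydata):
    """Return copies of xdata and ydata with every non-finite pair removed."""
    x = np.asarray(xdata, dtype=float)
    y = np.asarray(ydata, dtype=float)
    keep = np.isfinite(x) & np.isfinite(y)   # False wherever either value is NaN or +/-inf
    return x[keep], y[keep]

# Made-up sample data containing one NaN and one inf
x_raw = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_raw = np.array([1.0, np.nan, 5.0, np.inf, 17.0])

x_clean, y_clean = drop_nonfinite(x_raw, y_raw)
print(x_clean)
print(y_clean)
```

The cleaned arrays can then be passed to curve_fit in place of the raw ones. If the inputs are already finite and the error persists, the non-finite values are probably being produced by the model function itself during the fit.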

3 Answers
  •  攒了一身酷
    2021-02-05 16:28

    I was able to reproduce this error in Python 2.7 like so:

    import numpy as np
    from sklearn.decomposition import FastICA
    import load_data               # local helper module, not shown here

    X = load_data.load("stuff")    # sets X to a 2D numpy array containing
                                   # large positive and negative numbers
    ica = FastICA(whiten=False)

    print(np.isnan(X).any())   # prints False
    print(np.isinf(X).any())   # prints False

    ica.fit(X)                 # raises the ValueError shown below
    

    This always produces the error:

    /usr/lib64/python2.7/site-packages/sklearn/decomposition/fastica_.py:58: RuntimeWarning: invalid value encountered in sqrt
      return np.dot(np.dot(u * (1. / np.sqrt(s)), u.T), W)
    Traceback (most recent call last):
      File "main.py", line 43, in <module>
        ica()
      File "main.py", line 18, in ica
        ica.fit(X)
      File "/usr/lib64/python2.7/site-packages/sklearn/decomposition/fastica_.py", line 523, in fit
        self._fit(X, compute_sources=False)
      File "/usr/lib64/python2.7/site-packages/sklearn/decomposition/fastica_.py", line 479, in _fit
        compute_sources=compute_sources, return_n_iter=True)
      File "/usr/lib64/python2.7/site-packages/sklearn/decomposition/fastica_.py", line 335, in fastica
        W, n_iter = _ica_par(X1, **kwargs)
      File "/usr/lib64/python2.7/site-packages/sklearn/decomposition/fastica_.py", line 108, in _ica_par
        - g_wtx[:, np.newaxis] * W)
      File "/usr/lib64/python2.7/site-packages/sklearn/decomposition/fastica_.py", line 55, in _sym_decorrelation
        s, u = linalg.eigh(np.dot(W, W.T))
      File "/usr/lib64/python2.7/site-packages/scipy/linalg/decomp.py", line 297, in eigh
        a1 = asarray_chkfinite(a)
      File "/usr/lib64/python2.7/site-packages/numpy/lib/function_base.py", line 613, in asarray_chkfinite
        "array must not contain infs or NaNs")
    ValueError: array must not contain infs or NaNs
    

    Solution:

    import numpy as np
    from sklearn.decomposition import FastICA
    import load_data               # local helper module, not shown here

    X = load_data.load("stuff")    # sets X to a 2D numpy array containing
                                   # large positive and negative numbers
    ica = FastICA(whiten=False)

    # Column-wise standardization: each column is shifted to zero mean and
    # scaled to unit standard deviation, so the very large and very small
    # numbers become values of order 1 (not strictly bounded by -1 and 1).
    X = (X - np.mean(X, axis=0)) / np.std(X, axis=0)

    print(np.isnan(X).any())   # prints False
    print(np.isinf(X).any())   # prints False

    ica.fit(X)                 # this works correctly
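
One caveat with that one-liner, shown as a minimal sketch rather than anything from sklearn: a column with zero standard deviation turns into NaN under the division, which would bring back the same ValueError. The function name standardize_columns and the sample array are made up for illustration, and mapping constant columns to zeros is just one possible choice.

```python
import numpy as np

def standardize_columns(X):
    """Standardize each column, leaving constant columns at zero instead of NaN."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    safe_std = np.where(std == 0, 1.0, std)   # avoid 0/0 for constant columns
    return (X - mean) / safe_std

# Made-up data: the second column is constant
X_demo = np.array([[1.0, 7.0],
                   [3.0, 7.0],
                   [5.0, 7.0]])

with np.errstate(invalid="ignore", divide="ignore"):
    naive = (X_demo - X_demo.mean(axis=0)) / X_demo.std(axis=0)

guarded = standardize_columns(X_demo)

print(np.isnan(naive).any())      # True: the constant column became NaN
print(np.isnan(guarded).any())    # False
```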
    

    Why does that normalization step fix the error?

    I found the eureka moment here: sklearn's PLSRegression: "ValueError: array must not contain infs or NaNs"

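    What I think is happening is that numpy is being fed gigantic numbers and very tiny numbers, and inside its tiny brain it's creating NaNs and Infs along the way. The RuntimeWarning at the top of the traceback fits that picture: np.sqrt(s) hits an invalid value inside _sym_decorrelation, and it looks like the next call to linalg.eigh then rejects the array that came out of it. So it looks like a bug in sklearn. The workaround is to rescale the input data before handing it to the algorithm so that there are no very large or very small numbers.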
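
To illustrate the mechanism (not FastICA's actual computation), here is a minimal sketch in which perfectly finite input becomes non-finite after a couple of matrix products, while the standardized copy stays finite. The function gram_squared and the sample values are made up for illustration; np.asarray_chkfinite is the same check that appears at the bottom of the traceback.

```python
import numpy as np

def gram_squared(A):
    """Multiply A by its transpose, then square the result (a stand-in for repeated products)."""
    G = A @ A.T
    return G @ G

# Made-up data: every entry is finite, but around 1e100 in magnitude
X = np.array([[1e100, -3e100],
              [2e100,  5e100]])

with np.errstate(over="ignore", invalid="ignore"):
    raw = gram_squared(X)          # entries near 1e402 overflow float64

X_scaled = (X - np.mean(X, axis=0)) / np.std(X, axis=0)
scaled = gram_squared(X_scaled)

print(np.isfinite(X).all())        # True: the input itself is fine
print(np.isfinite(raw).all())      # False: overflow happened along the way
print(np.isfinite(scaled).all())   # True: same computation on standardized data

try:
    np.asarray_chkfinite(raw)
    message = None
except ValueError as exc:
    message = str(exc)
print(message)
```

This is why the isnan and isinf checks on X print False even though the fit still fails: the non-finite values only appear in intermediate results.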

    Bad sklearn! NO biscuit!
