How to fix “polyfit may be poorly conditioned” in numpy?

醉酒成梦 2021-01-18 11:26

I am trying to fit a polynomial to a set of data using the numpy package.

The following is the code, it can run successfully. The fitted line seems to fit the data when the o

2 Answers
  • 2021-01-18 11:37

    You will get a better fit if you use the Polynomial class. If you evaluate a high-order fit past the ends of the data, though, you will still see the fast divergence shown above, because you are extrapolating. To use the Polynomial class:

    from numpy.polynomial import Polynomial as P
    p = P.fit(x, y, order)
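
A minimal sketch of what `Polynomial.fit` does differently, assuming synthetic noisy sine data in place of the question's data (the arrays `x`, `y` and the value of `order` here are illustrative, not from the original post). The fit rescales `x` from the data range (`domain`) into `[-1, 1]` (`window`) before solving, which is where the better conditioning comes from:

```python
import numpy as np
from numpy.polynomial import Polynomial as P

# Synthetic stand-in for the question's data (an assumption, not the real data).
rng = np.random.default_rng(0)
x = np.linspace(-2 * np.pi, 2 * np.pi, 200)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)

order = 15  # illustrative choice
p = P.fit(x, y, order)

# fit() maps x from `domain` (the data range) into `window` ([-1, 1]),
# so high powers of x stay bounded during the least-squares solve.
print("domain:", p.domain)
print("window:", p.window)

# Calling the object applies the same rescaling before evaluating.
rms = float(np.sqrt(np.mean((p(x) - y) ** 2)))
print("RMS residual on the samples:", rms)

# p.coef holds coefficients in the rescaled variable; convert() expresses
# the same polynomial in the original x if those are needed.
coef_unscaled = p.convert().coef
```

Note that `p.coef` is not directly comparable to the output of `np.polyfit`: besides the rescaling, `Polynomial` stores coefficients from lowest to highest degree, while `np.polyfit` returns them highest first.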
    

    You can also experiment with a more stable polynomial basis, such as Chebyshev, that is better conditioned at high orders (say 100+), although orders that high are hardly justified with noisy data like yours.

    from numpy.polynomial import Chebyshev as T
    p = T.fit(x, y, order)
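
To make "better conditioned" concrete, here is a rough sketch that compares the condition number of the least-squares design matrix in the power basis and in the Chebyshev basis, both on `[-1, 1]`. The sample count, the degree, and the synthetic data are assumptions chosen for illustration:

```python
import numpy as np
from numpy.polynomial import Chebyshev as T
from numpy.polynomial import chebyshev as C
from numpy.polynomial import polynomial as Pn

# Sample points already scaled to [-1, 1]; degree is an illustrative choice.
t = np.linspace(-1, 1, 500)
deg = 30

# Design (Vandermonde-type) matrices for the two bases.
cond_power = np.linalg.cond(Pn.polyvander(t, deg))
cond_cheb = np.linalg.cond(C.chebvander(t, deg))
print(f"power basis:     {cond_power:.2e}")
print(f"Chebyshev basis: {cond_cheb:.2e}")

# Fitting in the Chebyshev basis; synthetic data stands in for the question's.
rng = np.random.default_rng(0)
x = 2 * np.pi * t
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)
p = T.fit(x, y, deg)
rms = float(np.sqrt(np.mean((p(x) - y) ** 2)))
print("RMS residual on the samples:", rms)
```

A smaller condition number means rounding errors in the solve are amplified less. It says nothing about whether such a high degree is a sensible model for the data.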
    

    To get plotting points x, y that stay within the bounds of the data, use the fit's linspace method:

    import matplotlib.pyplot as plt
    plt.plot(*p.linspace(500))
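
For reference, a small sketch of what `linspace` returns, assuming a noise-free quadratic as placeholder data: two arrays of the requested length, with the x values spanning exactly the fit's domain, so nothing is extrapolated.

```python
import numpy as np
from numpy.polynomial import Polynomial as P

# Placeholder data: an exact quadratic on [0, 10] (an assumption for illustration).
x = np.linspace(0.0, 10.0, 50)
y = 0.5 * x**2 - x + 2.0

p = P.fit(x, y, 2)

# 500 equally spaced points across p.domain, plus the fit evaluated there.
xs, ys = p.linspace(500)
print(xs.shape, ys.shape)
print(xs[0], xs[-1])  # endpoints of the data range
```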
    
  • 2021-01-18 11:39

    TL;DR: In this case the warning means: use a lower order!

    To quote the documentation:

    Note that fitting polynomial coefficients is inherently badly conditioned when the degree of the polynomial is large or the interval of sample points is badly centered. The quality of the fit should always be checked in these cases. When polynomial fits are not satisfactory, splines may be a good alternative.

    In other words, the warning tells you to double-check the results. If they seem fine, don't worry. But are they fine? To answer that, evaluate the resulting fit not only at the data points used for fitting (these often match rather well, especially when overfitting) but also between them. Consider this:

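    import numpy as np
    import matplotlib.pyplot as plt
    
    # x, y are the data arrays from the question.
    # Assumed setup: a 3x3 grid of axes, one panel per order.
    fig, ax = plt.subplots(3, 3)
    
    # fine grid for evaluating the fit between the sample points
    xp = np.linspace(-1, 1, 10000) * 2 * np.pi
    
    for n in range(3):
        for k in range(3):
    
            order = 20*n+10*k+1
            print(order)
            z = np.polyfit(x,y,order)
            p = np.poly1d(z)
    
            ax[n,k].scatter(x,y,label = "Real data",s=1)
            ax[n,k].plot(xp,p(xp),label = "Polynomial with order={}".format(order), color='C1')
            ax[n,k].legend()
    
    plt.show()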
    

    Here we evaluate the polyfit on points spaced much more finely than the sample data. This is the result:

    You can see that for orders 40 and above the results really shoot off. This coincides with the warnings I get.
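
The same check can be done without a plot. The sketch below records, for a few orders, whether `np.polyfit` raised its rank warning and how large the fitted curve gets on the fine grid. The data are synthetic (an assumption), so the order at which the warning first appears will not necessarily match the question's. The warning is matched by class name because, as far as I know, the class has moved between NumPy versions (`np.RankWarning` in older releases, `np.exceptions.RankWarning` in newer ones).

```python
import warnings
import numpy as np

# Synthetic stand-in for the question's data (an assumption).
rng = np.random.default_rng(0)
x = np.linspace(-2 * np.pi, 2 * np.pi, 200)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)

# Fine grid over the same interval as the samples.
xp = np.linspace(-1, 1, 10000) * 2 * np.pi


def check_order(order):
    """Fit with np.polyfit; return (warned, largest |fit| on the fine grid)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure the warning is not suppressed
        z = np.polyfit(x, y, order)
    # Match by name so the check does not depend on where the class lives.
    warned = any(w.category.__name__ == "RankWarning" for w in caught)
    peak = float(np.max(np.abs(np.polyval(z, xp))))
    return warned, peak


# Illustrative orders: one low, one moderate, one clearly too high.
results = {order: check_order(order) for order in (5, 20, 60)}
for order, (warned, peak) in results.items():
    print(f"order={order:3d}  rank warning={warned}  max |fit| on grid={peak:.3g}")
```

A peak far above the range of the data is the numeric counterpart of a curve that shoots off in the plot.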
