Interpolating time series in Pandas using Cubic spline

一整个雨季 2020-12-17 03:46

I would like to fill gaps in a column in my DataFrame using a cubic spline. If I were to export to a list then I could use SciPy's interp1d function and ap

1 answer
  • 2020-12-17 04:03

    Most NumPy/SciPy functions only require their arguments to be "array_like", and interp1d is no exception. Fortunately, both Series and DataFrame are "array_like", so we don't need to leave pandas:

    import pandas as pd
    import numpy as np
    from scipy.interpolate import interp1d
    
    df = pd.DataFrame([np.arange(1, 6), [1, 8, 27, np.nan, 125]]).T
    
    In [5]: df
    Out[5]: 
       0    1
    0  1    1
    1  2    8
    2  3   27
    3  4  NaN
    4  5  125
    
    df2 = df.dropna()  # fit the spline on the non-NaN rows only
    f = interp1d(df2[0], df2[1], kind='cubic')
    # f(4) == array(63.9999999999992)
    
    df[1] = f(df[0])  # evaluate the spline at every x, which fills the gap
    
    In [10]: df
    Out[10]: 
       0    1
    0  1    1
    1  2    8
    2  3   27
    3  4   64
    4  5  125
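
As an alternative to calling interp1d by hand, pandas can do the same fill itself through Series.interpolate. The following is a minimal sketch, assuming SciPy is installed and that the x values are moved into a numeric (or datetime) index, since the spline is evaluated against the index. The names s and filled are illustrative and do not come from the answer above.

```python
import numpy as np
import pandas as pd

# Same data as above, but with the x values (1..5) used as the index.
s = pd.Series([1, 8, 27, np.nan, 125], index=[1, 2, 3, 4, 5], dtype=float)

# method="cubic" delegates to SciPy and fits a cubic spline through the
# non-NaN points, using the index as the x coordinate.
filled = s.interpolate(method="cubic")

print(filled)
```

For a real time series the same call should work with a DatetimeIndex, in which case the spacing between timestamps plays the role of x.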
    

    Note: I couldn't think of an example off the top of my head that passes a DataFrame as the second argument (y), but that ought to work too.
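
One way the multi-column case mentioned in the note might look is sketched below. It assumes the gaps sit in the same rows for every column, and it passes axis=0 so that interp1d treats each column as a separate y series (the default is the last axis). The column names "x", "cube" and "square" are made up for this illustration.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import interp1d

x = np.arange(1, 7)
df = pd.DataFrame({"x": x, "cube": x**3.0, "square": x**2.0})
df.loc[3, ["cube", "square"]] = np.nan  # knock out the row where x == 4

cols = ["cube", "square"]
known = df.dropna()  # rows where every column is present

# y is 2-D here (rows = samples, columns = series), so interpolate along axis 0.
f = interp1d(known["x"].to_numpy(), known[cols].to_numpy(),
             kind="cubic", axis=0)

filled = df.copy()
filled[cols] = f(df["x"].to_numpy())  # result has shape (len(df), len(cols))

print(filled)
```

If the columns have gaps in different rows, fitting one interp1d per column on its own non-NaN rows is probably the simpler route.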
