Interpolation on DataFrame in pandas

梦谈多话 2020-11-29 00:41

I have a DataFrame, say a volatility surface with time as the index and strike as the columns. How do I do two-dimensional interpolation? I can reindex, but how do I deal

2 Answers
  • 2020-11-29 00:55

    You can use DataFrame.interpolate to get a linear interpolation.

    In : import numpy

    In : import pandas

    In : df = pandas.DataFrame(numpy.random.randn(5,3), index=['a','c','d','e','g'])
    
    In : df
    Out:
              0         1         2
    a -1.987879 -2.028572  0.024493
    c  2.092605 -1.429537  0.204811
    d  0.767215  1.077814  0.565666
    e -1.027733  1.330702 -0.490780
    g -1.632493  0.938456  0.492695
    
    In : df2 = df.reindex(['a','b','c','d','e','f','g'])
    
    In : df2
    Out:
              0         1         2
    a -1.987879 -2.028572  0.024493
    b       NaN       NaN       NaN
    c  2.092605 -1.429537  0.204811
    d  0.767215  1.077814  0.565666
    e -1.027733  1.330702 -0.490780
    f       NaN       NaN       NaN
    g -1.632493  0.938456  0.492695
    
    In : df2.interpolate()
    Out:
              0         1         2
    a -1.987879 -2.028572  0.024493
    b  0.052363 -1.729055  0.114652
    c  2.092605 -1.429537  0.204811
    d  0.767215  1.077814  0.565666
    e -1.027733  1.330702 -0.490780
    f -1.330113  1.134579  0.000958
    g -1.632493  0.938456  0.492695
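
    One caveat worth illustrating: with the default method='linear', interpolate treats the rows as equally spaced and ignores the index values. That is the natural reading for the string labels above, but for a numeric index (times or strikes, say) method='index' weights by the actual index values instead. A minimal sketch, using made-up numbers and a hypothetical column name vol:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing value, unevenly spaced numeric index.
df = pd.DataFrame({"vol": [0.0, np.nan, 10.0]}, index=[0.0, 1.0, 10.0])

# Default method='linear': rows are treated as equally spaced,
# so the gap is filled with the midpoint of its neighbours.
equal_spacing = df.interpolate()

# method='index': the gap is filled according to where 1.0 sits
# between the index values 0.0 and 10.0.
by_index = df.interpolate(method="index")

print(equal_spacing)
print(by_index)
```

    Here the first call fills the gap with the midpoint of the neighbouring values, while the second places it one tenth of the way between them.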
    

    For anything more complex, you need to roll your own function that takes a Series, fills the NaN values as you like, and returns another Series.
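
    As a rough illustration of such a custom function, the sketch below fills the gaps in one column with numpy.interp and is applied column by column with DataFrame.apply. The name fill_by_index and the sample numbers are hypothetical, and the sketch assumes a sorted numeric index with at least one known value per column:

```python
import numpy as np
import pandas as pd


def fill_by_index(series: pd.Series) -> pd.Series:
    """Fill NaNs in one column by linear interpolation against the index.

    Assumes the index is numeric and sorted in increasing order.
    """
    known = series.notna().to_numpy()
    x_all = series.index.to_numpy(dtype=float)
    x_known = x_all[known]
    y_known = series.to_numpy(dtype=float)[known]
    # np.interp holds the edge values constant outside the known range.
    filled = np.interp(x_all, x_known, y_known)
    return pd.Series(filled, index=series.index, name=series.name)


# Hypothetical surface: rows are times, columns are strikes.
surface = pd.DataFrame(
    {100.0: [0.20, np.nan, 0.30], 110.0: [0.10, np.nan, 0.50]},
    index=[1.0, 2.0, 5.0],
)
filled = surface.apply(fill_by_index)  # one call per column
print(filled)
```

    Any other fill rule (splines, flat fill, a model-based curve) could be swapped in for the numpy.interp call, as long as the function still returns a Series on the same index.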

  • 2020-11-29 00:59

    Old thread, but I thought I would share my solution for 2D interpolation/extrapolation. It respects the index values and adds missing points on demand. The code ended up a bit awkward, so let me know if there is a better solution:

    import numpy
    import pandas
    from numpy import nan
    
    dataGrid = pandas.DataFrame({1: {1: 1, 3: 2},
                                 2: {1: 3, 3: 4}})
    
    
    def getExtrapolatedInterpolatedValue(x, y):
        global dataGrid
        if x not in dataGrid.index:
            # Add an empty row at x, then fill it from the neighbouring rows.
            # .ix and DataFrame.sort() were removed from pandas; use .loc and sort_index().
            dataGrid.loc[x] = nan
            dataGrid = dataGrid.sort_index()
            dataGrid = dataGrid.interpolate(method='index', axis=0).ffill(axis=0).bfill(axis=0)

        if y not in dataGrid.columns.values:
            # Add an empty column at y, then fill it from the neighbouring columns.
            dataGrid = dataGrid.reindex(columns=numpy.append(dataGrid.columns.values, y))
            dataGrid = dataGrid.sort_index(axis=1)
            dataGrid = dataGrid.interpolate(method='index', axis=1).ffill(axis=1).bfill(axis=1)

        return dataGrid.loc[x, y]


    print(getExtrapolatedInterpolatedValue(2, 1.4))
    >>2.3
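
    For comparison, here is a sketch of the same lookup that does not modify the grid: it interpolates each column to the row position with numpy.interp, then interpolates the resulting row to the column position. This is a swapped-in technique rather than the code above; the name interpolate_at is hypothetical, and it assumes a sorted numeric index and columns with no NaN in the grid. Outside the grid, numpy.interp holds the edge values constant, which matches the ffill/bfill behaviour above.

```python
import numpy as np
import pandas as pd


def interpolate_at(grid: pd.DataFrame, x: float, y: float) -> float:
    """Bilinear lookup at (row position x, column position y).

    Assumes grid.index and grid.columns are numeric, sorted increasing,
    and that the grid contains no NaN values.
    """
    xs = grid.index.to_numpy(dtype=float)
    ys = grid.columns.to_numpy(dtype=float)
    values = grid.to_numpy(dtype=float)
    # Step 1: interpolate every column along the index to position x.
    row_at_x = np.array(
        [np.interp(x, xs, values[:, j]) for j in range(values.shape[1])]
    )
    # Step 2: interpolate that row along the columns to position y.
    return float(np.interp(y, ys, row_at_x))


data_grid = pd.DataFrame({1: {1: 1, 3: 2},
                          2: {1: 3, 3: 4}})
result = interpolate_at(data_grid, 2, 1.4)
print(result)
```

    Because nothing is written back to the DataFrame, repeated calls do not grow the grid, at the cost of redoing the column pass on every lookup.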
    