Rolling a function on a data frame

抹茶落季 2021-01-05 14:15

I have the following data frame C.

>>> C
              a    b   c
2011-01-01    0    0 NaN
2011-01-02   41   12 NaN
2011-01-03   82   2         


        
1 answer
  •  逝去的感伤
    2021-01-05 14:43

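    You could use pd.rolling_apply. The trick is to roll over the row positions np.arange(len(df)) instead of over a column, so that foo receives the positions of each window and can look up the full rows in df: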

    import numpy as np
    import pandas as pd
    df = pd.read_table('data', sep=r'\s+')
    
    def foo(x, df):
        window = df.iloc[x]
        # print(window)
        c = df.ix[int(x[-1]), 'c']
        dvals = window['a'] + window['b']*c
        return bar(dvals)
    
    def bar(dvals):
        # print(dvals)
        return dvals.mean()
    
    df['e'] = pd.rolling_apply(np.arange(len(df)), 6, foo, args=(df,))
    print(df)
    

    yields

                  a    b   c       e
    2011-01-01    0    0 NaN     NaN
    2011-01-02   41   12 NaN     NaN
    2011-01-03   82   24 NaN     NaN
    2011-01-04  123   36 NaN     NaN
    2011-01-05  164   48 NaN     NaN
    2011-01-06  205   60   2   162.5
    2011-01-07  246   72   4   311.5
    2011-01-08  287   84   6   508.5
    2011-01-09  328   96   8   753.5
    2011-01-10  369  108  10  1046.5
    

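    The args and kwargs parameters were added to rolling_apply in pandas 0.14.0. Note that the top-level pd.rolling_* functions were deprecated in pandas 0.18.0 and removed in 0.23.0, and that df.ix was removed in pandas 1.0, so the code above runs as written only on older versions.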
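
    On a recent pandas, where pd.rolling_apply is no longer available, the same idea can be sketched with Series.rolling(...).apply. This is only a rough equivalent under some assumptions: the frame is rebuilt by hand from the output table above instead of being read from the 'data' file, the name positions is made up for illustration, and bar is folded into foo for brevity. With raw=True the window arrives as a float array, so it is cast to int before indexing:

```python
import numpy as np
import pandas as pd

# Assumption: rebuild the example frame from the output table above
# instead of reading it from the 'data' file.
df = pd.DataFrame(
    {
        "a": np.arange(10) * 41,
        "b": np.arange(10) * 12,
        "c": [np.nan] * 5 + [2, 4, 6, 8, 10],
    },
    index=pd.date_range("2011-01-01", periods=10),
)

def foo(x, df):
    # x holds the row positions of the current window (as floats)
    pos = x.astype(int)
    window = df.iloc[pos]
    # take c from the last row of the window
    c = df["c"].iloc[pos[-1]]
    dvals = window["a"] + window["b"] * c
    return dvals.mean()

# Roll over the row positions, not over a data column
positions = pd.Series(np.arange(len(df)), index=df.index)
df["e"] = positions.rolling(6).apply(foo, raw=True, args=(df,))
print(df)
```

    With these values the e column should match the table above: NaN for the first five rows, then 162.5, 311.5, 508.5, 753.5 and 1046.5.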

    Since in my example above df is a global variable, it is not really necessary to pass it to foo as an argument. You could simply remove df from the def foo line and also omit the args=(df,) in the call to rolling_apply.

    However, there are times when df might not be defined in a scope accessible by foo. In that case, there is a simple workaround -- make a closure:

    def foo(df):
        def inner_foo(x):
            window = df.iloc[x]
            # print(window)
            c = df.ix[int(x[-1]), 'c']
            dvals = window['a'] + window['b']*c
            return bar(dvals)
        return inner_foo
    
    df['e'] = pd.rolling_apply(np.arange(len(df)), 6, foo(df))
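
    The closure trick does not depend on rolling_apply itself. As a hedged sketch for newer pandas (assuming Series.rolling(...).apply with raw=True, a hand-built copy of the example frame, and the illustrative names make_foo, frame and positions), the closure can be compared with functools.partial from the standard library, which binds the frame in a similar way:

```python
from functools import partial

import numpy as np
import pandas as pd

# Assumption: a hand-built copy of the example frame
df = pd.DataFrame(
    {
        "a": np.arange(10) * 41,
        "b": np.arange(10) * 12,
        "c": [np.nan] * 5 + [2, 4, 6, 8, 10],
    },
    index=pd.date_range("2011-01-01", periods=10),
)

def foo(x, frame):
    # x holds the row positions of the current window (as floats)
    pos = x.astype(int)
    window = frame.iloc[pos]
    c = frame["c"].iloc[pos[-1]]
    return (window["a"] + window["b"] * c).mean()

def make_foo(frame):
    # Closure: inner_foo remembers frame without any global variable
    def inner_foo(x):
        return foo(x, frame)
    return inner_foo

positions = pd.Series(np.arange(len(df)), index=df.index)

# Both callables take only the window and carry the frame with them
via_closure = positions.rolling(6).apply(make_foo(df), raw=True)
via_partial = positions.rolling(6).apply(partial(foo, frame=df), raw=True)
```

    Both variants should give the same result here. Which one to use is mostly a matter of taste: partial is shorter, while the closure leaves room for extra setup inside make_foo.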
    
