I have the following data frame C.
>>> C
a b c
2011-01-01 0 0 NaN
2011-01-02 41 12 NaN
2011-01-03 82 2
You could use pd.rolling_apply:
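import numpy as np
import pandas as pd

df = pd.read_table('data', sep=r'\s+')

def foo(x, df):
    # x holds the integer positions of the rows in the current window
    window = df.iloc[x]
    # print(window)
    # value of c on the last row of the window (positional lookup)
    c = df['c'].iloc[int(x[-1])]
    dvals = window['a'] + window['b']*c
    return bar(dvals)

def bar(dvals):
    # print(dvals)
    return dvals.mean()

# roll over row positions rather than over a data column, so that foo
# can use them to slice several columns of df at once
df['e'] = pd.rolling_apply(np.arange(len(df)), 6, foo, args=(df,))
print(df)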
yields
              a    b    c       e
2011-01-01    0    0  NaN     NaN
2011-01-02   41   12  NaN     NaN
2011-01-03   82   24  NaN     NaN
2011-01-04  123   36  NaN     NaN
2011-01-05  164   48  NaN     NaN
2011-01-06  205   60    2   162.5
2011-01-07  246   72    4   311.5
2011-01-08  287   84    6   508.5
2011-01-09  328   96    8   753.5
2011-01-10  369  108   10  1046.5
The args and kwargs parameters were added to rolling_apply in Pandas version 0.14.0.
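For readers on a newer pandas, where the top-level pd.rolling_apply function is no longer available, here is a minimal sketch of the same idea using Series.rolling(...).apply, which also accepts args. The frame is rebuilt by hand from the printed output above rather than read from the 'data' file, and the integer cast of x is an assumption needed because raw=True hands the window to foo as a float array.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the printed output (values copied by hand).
df = pd.DataFrame(
    {
        'a': np.arange(10) * 41,
        'b': np.arange(10) * 12,
        'c': [np.nan] * 5 + [2.0, 4.0, 6.0, 8.0, 10.0],
    },
    index=pd.date_range('2011-01-01', periods=10),
)

def foo(x, df):
    # x arrives as a float ndarray of row positions; cast back to integers
    pos = x.astype(int)
    window = df.iloc[pos]
    # value of c on the last row of the window
    c = df['c'].iloc[pos[-1]]
    return (window['a'] + window['b'] * c).mean()

# Roll over row positions, passing df through args as in the answer.
positions = pd.Series(np.arange(len(df)))
e = positions.rolling(6).apply(foo, raw=True, args=(df,))

# positions has a 0..n-1 index, so assign by position, not by label
df['e'] = e.to_numpy()
print(df)
```

The first five rows of e are NaN because a window of 6 is not yet full, which matches the output shown above.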
Since df is a global variable in my example above, it is not really necessary to pass it to foo as an argument. You could simply remove df from the def foo line and also omit the args=(df,) in the call to rolling_apply.
However, there are times when df might not be defined in a scope accessible by foo. In that case there is a simple workaround, which is to make a closure:
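def foo(df):
    # inner_foo keeps a reference to df, so no args are needed
    def inner_foo(x):
        window = df.iloc[x]
        # print(window)
        # value of c on the last row of the window (positional lookup)
        c = df['c'].iloc[int(x[-1])]
        dvals = window['a'] + window['b']*c
        return bar(dvals)
    return inner_foo

df['e'] = pd.rolling_apply(np.arange(len(df)), 6, foo(df))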
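To illustrate the case where df is not visible from foo's scope, the sketch below keeps the frame local to a function and binds it with a closure. It assumes a newer pandas with Series.rolling(...).apply. The names make_foo, add_e and build_example are hypothetical helpers introduced only for this illustration, and the sample frame is rebuilt by hand from the output printed earlier.

```python
import numpy as np
import pandas as pd

def make_foo(df):
    # Closure: inner_foo remembers df even though df is not a global.
    def inner_foo(x):
        pos = x.astype(int)           # raw=True passes a float ndarray
        window = df.iloc[pos]
        c = df['c'].iloc[pos[-1]]     # c on the last row of the window
        return (window['a'] + window['b'] * c).mean()
    return inner_foo

def add_e(frame, width=6):
    # frame is a local name here; make_foo(frame) binds it for the callback.
    positions = pd.Series(np.arange(len(frame)))
    e = positions.rolling(width).apply(make_foo(frame), raw=True)
    out = frame.copy()
    out['e'] = e.to_numpy()           # assign by position, not by label
    return out

def build_example():
    # Sample data copied by hand from the printed output.
    return pd.DataFrame(
        {
            'a': np.arange(10) * 41,
            'b': np.arange(10) * 12,
            'c': [np.nan] * 5 + [2.0, 4.0, 6.0, 8.0, 10.0],
        },
        index=pd.date_range('2011-01-01', periods=10),
    )

result = add_e(build_example())
print(result)
```

Because the frame travels inside the closure, the rolling call needs neither a global df nor an args tuple.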