Question
I am trying to create a function which resamples time series data in pandas. I would like to have the option to specify the type of aggregation that occurs depending on what type of data I am sending through (e.g. for some data, taking the sum of each bin is appropriate, while for others, taking the mean is needed). For example, given data like this:
import pandas as pd
import numpy as np
dr = pd.date_range('01-01-2020', '01-03-2020', freq='1H')
df = pd.DataFrame(np.random.rand(len(dr)), index=dr)
I could have a function like this:
def process(df, freq='3H', method='sum'):
    r = df.resample(freq)
    if method == 'sum':
        r = r.sum()
    elif method == 'mean':
        r = r.mean()
    #...
    #more options
    #...
    return r
For a small number of aggregation methods, this is fine, but it seems like it could become tedious if I wanted to select from all of the possible choices.
I was hoping to use getattr to implement something like this post (under "Putting it to work: generalizing method calls"). However, I can't find a way to do this:
def process2(df, freq='3H', method='sum'):
    r = df.resample(freq)
    foo = getattr(r, method)
    return r.foo()
#fails with:
#AttributeError: 'DatetimeIndexResampler' object has no attribute 'foo'
def process3(df, freq='3H', method='sum'):
    r = df.resample(freq)
    foo = getattr(r, method)
    return foo(r)
#fails with:
#TypeError: __init__() missing 1 required positional argument: 'obj'
I get why process2 fails (calling r.foo() looks for the method foo() of r, not the variable foo). But I don't think I get why process3 fails.
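A note on process3: getattr(r, method) already returns a method bound to r, so foo(r) hands r over a second time as an extra argument. Calling the bound method with no arguments should be enough. The sketch below is a minimal illustration, not a definitive implementation: the name process_dynamic, the column name "value", and the sample data are hypothetical, and the lowercase 'h' alias is used on the assumption that newer pandas releases deprecate the uppercase 'H' spelling.

```python
import pandas as pd


def process_dynamic(df, freq="3h", method="sum"):
    """Resample df and apply the Resampler method named by `method`."""
    resampler = df.resample(freq)
    # getattr returns a bound method, so it already knows about `resampler`.
    aggregate = getattr(resampler, method)
    # Call it with no arguments; passing `resampler` again is what broke process3.
    return aggregate()


# Hypothetical sample data: six hourly readings with values 0..5.
index = pd.date_range("2020-01-01", periods=6, freq="h")
df = pd.DataFrame({"value": range(6)}, index=index)

totals = process_dynamic(df, method="sum")   # sum of each 3-hour bin
means = process_dynamic(df, method="mean")   # mean of each 3-hour bin
```

An unknown method name raises AttributeError from getattr, which may or may not be the error you want to surface to callers.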
I know another approach would be to pass functions to the parameter method, and then apply those functions on r. My suspicion is that this would be less efficient. And it still doesn't allow me to access the built-in Resampler methods directly.
Is there a working, more concise way to achieve this? Thanks!
Answer 1:
Try .resample().apply(method)
But unless you are planning some more computation inside the function, it will probably be easier to just hard-code this line.
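As a rough sketch of the suggestion above: the Resampler's apply is an alias of aggregate (agg), and both accept the name of an aggregation as a string, so the dispatch can be a single line. The function name process_agg, the column name "value", and the sample data are hypothetical, and the lowercase 'h' alias is an assumption made because newer pandas releases deprecate the uppercase 'H' spelling.

```python
import pandas as pd


def process_agg(df, freq="3h", method="sum"):
    """Resample df and aggregate each bin with the named method."""
    # A string such as "sum", "mean" or "max" is resolved by pandas
    # to the corresponding built-in aggregation.
    return df.resample(freq).agg(method)


# Hypothetical sample data: six hourly readings with values 0..5.
index = pd.date_range("2020-01-01", periods=6, freq="h")
df = pd.DataFrame({"value": range(6)}, index=index)

maxima = process_agg(df, method="max")  # largest value in each 3-hour bin
default = process_agg(df)               # falls back to "sum"
```

agg also accepts a callable, so the same function can take custom aggregations, although the string names are likely to be faster because pandas can use its optimized implementations for them.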
Source: https://stackoverflow.com/questions/63384448/can-i-dynamically-choose-the-method-applied-on-a-pandas-resampler-object