I use Pandas a lot and it's great. I use TimeGrouper as well, and it's great. I actually don't know where the documentation for TimeGrouper is.
pd.TimeGrouper() was formally deprecated in pandas v0.21.0 in favor of pd.Grouper().
The best use of pd.Grouper() is within groupby() when you're also grouping on non-datetime columns. If you just need to group on a frequency, use resample().
For example, say you have:
>>> import pandas as pd
>>> import numpy as np
>>> np.random.seed(444)
>>> df = pd.DataFrame({'a': np.random.choice(['x', 'y'], size=50),
...                    'b': np.random.rand(50)},
...                   index=pd.date_range('2010', periods=50))
>>> df.head()
            a         b
2010-01-01  y  0.959568
2010-01-02  x  0.784837
2010-01-03  y  0.745148
2010-01-04  x  0.965686
2010-01-05  y  0.654552
You could do:
>>> # `a` is dropped because it is non-numeric (on pandas 2.0+, pass numeric_only=True to sum() to get this)
>>> df.groupby(pd.Grouper(freq='M')).sum()
                  b
2010-01-31  18.5123
2010-02-28   7.7670
But the above is a little unnecessary because you're only grouping on the index. Instead you could do:
>>> df.resample('M').sum()
                    b
2010-01-31  16.168086
2010-02-28   9.433712
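to produce the same result. (On a given df the two calls return identical monthly sums; the figures printed here differ from those above, so the two outputs were presumably not captured from the same data.)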
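If you want to convince yourself that the two spellings agree, here is a minimal sketch that compares them directly. The sample data and variable names are my own, and I use the "MS" (month start) alias instead of 'M' only because the month-end alias has been renamed in newer pandas releases; the comparison works the same way with either.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 50 daily values starting 2010-01-01.
rng = np.random.default_rng(0)
df = pd.DataFrame({"b": rng.random(50)},
                  index=pd.date_range("2010-01-01", periods=50))

# Assumption: "MS" (month start) is used as the frequency alias here.
by_grouper = df.groupby(pd.Grouper(freq="MS")).sum()
by_resample = df.resample("MS").sum()

# Compare the bin labels and the aggregated values separately.
same_index = by_grouper.index.equals(by_resample.index)
same_values = np.allclose(by_grouper["b"], by_resample["b"])
print(same_index and same_values)
```

Since only the index is being grouped, resample() is the shorter of the two.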
Conversely, here's a case where Grouper() would be useful:
>>> df.groupby([pd.Grouper(freq='M'), 'a']).sum()
                    b
           a
2010-01-31 x  8.9452
           y  9.5671
2010-02-28 x  4.2522
           y  3.5148
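A related case, sketched here under my own assumptions: when the timestamps live in a regular column rather than the index, Grouper() takes a key argument naming that column. The column names (`when`, `a`, `b`) and the data are made up for illustration, and "MS" is again used as the frequency alias.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the dates are an ordinary column, not the index.
df = pd.DataFrame({
    "when": pd.date_range("2010-01-01", periods=50),
    "a": np.tile(["x", "y"], 25),   # alternating labels x, y, x, y, ...
    "b": np.ones(50),               # all ones, so each sum is a row count
})

# Group by calendar month of `when` and by the label in `a`.
# Selecting "b" before summing keeps the result to the numeric column.
out = df.groupby([pd.Grouper(key="when", freq="MS"), "a"])["b"].sum()
print(out)
```

The result is a Series with a two-level index (month, then `a`), the same shape as the output shown above.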
For some more detail, take a look at Chapter 7 of Ted Petrou's Pandas Cookbook.
pandas.TimeGrouper() was deprecated in favour of pandas.Grouper() in pandas v0.21. Use pandas.Grouper() instead.
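As a rough illustration of the switch, with a made-up Series and the "MS" alias as my own choices: the old call appears only as a comment, since TimeGrouper is no longer available in recent pandas releases.

```python
import pandas as pd

# Hypothetical data: six daily values straddling the January/February boundary.
s = pd.Series(range(6), index=pd.date_range("2010-01-30", periods=6))

# Old spelling, shown for comparison only:
#   s.groupby(pd.TimeGrouper(freq="MS")).sum()

# Replacement using pd.Grouper with the same freq argument.
totals = s.groupby(pd.Grouper(freq="MS")).sum()
print(totals)
```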