I have the following DataFrame:
df = pd.DataFrame({
'Branch' : 'A A A A A B'.split(),
'Buyer': 'Carl Mark Carl Joe Joe Carl'.split(),
\'Quantity\': [
You can now group by a TimeGrouper together with another column (as of pandas 0.14, IIRC):
In [11]: df1 = df.set_index('Date')
In [12]: g = df1.groupby([pd.TimeGrouper('20D'), 'Branch'])
In [13]: g.sum()
Out[13]:
                            Quantity
Date                Branch
2013-01-01 13:00:00 A              4
2013-09-18 13:00:00 A             13
2013-11-17 13:00:00 A              9
                    B              3
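For readers on a recent pandas: `pd.TimeGrouper` was later deprecated and removed, and `pd.Grouper(freq=...)` is the usual replacement. Below is a minimal sketch of the same grouping with `pd.Grouper`. The `Quantity` and `Date` values are assumptions (the frame in the question is cut off here), picked only so that the sums match the output above. Depending on the pandas version, the bin labels may start at midnight rather than at 13:00.

```python
import datetime as dt

import pandas as pd

# Illustrative data: the Quantity and Date values are assumed, not taken
# from the question, and chosen so the sums match the output shown above.
df = pd.DataFrame({
    'Branch': 'A A A A A B'.split(),
    'Buyer': 'Carl Mark Carl Joe Joe Carl'.split(),
    'Quantity': [1, 3, 5, 8, 9, 3],
    'Date': [
        dt.datetime(2013, 1, 1, 13, 0),
        dt.datetime(2013, 1, 1, 13, 5),
        dt.datetime(2013, 10, 1, 20, 0),
        dt.datetime(2013, 10, 2, 10, 0),
        dt.datetime(2013, 12, 2, 12, 0),
        dt.datetime(2013, 12, 2, 14, 0),
    ],
})

# Group on 20-day bins of the DatetimeIndex plus the Branch column.
df1 = df.set_index('Date')
# Select Quantity explicitly so the string column Buyer is not summed.
result = df1.groupby([pd.Grouper(freq='20D'), 'Branch'])['Quantity'].sum()
print(result)
```

Only the (bin, Branch) combinations that actually occur in the data show up in the result, so there are no empty 20-day rows to filter out.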
From the discussion here: https://github.com/pydata/pandas/issues/3791
In [38]: df.set_index('Date').groupby(pd.TimeGrouper('6M')).apply(lambda x: x.groupby('Branch').sum())
Out[38]:
                   Quantity
           Branch
2013-01-31 A              4
2014-01-31 A             22
           B              3
And a slightly more complicated case:
In [55]: def testf(df):
   ....:     if (df['Buyer'] == 'Mark').sum() > 0:
   ....:         return pd.Series(dict(quantity = df['Quantity'].sum(), buyer = 'mark'))
   ....:     return pd.Series(dict(quantity = df['Quantity'].sum()*100, buyer = 'other'))
   ....:
In [56]: df.set_index('Date').groupby(pd.TimeGrouper('6M')).apply(lambda x: x.groupby('Branch').apply(testf))
Out[56]:
                  buyer  quantity
           Branch
2013-01-31 A       mark         4
2014-01-31 A      other      2200
           B      other       300
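The same result can probably be had without the nested `apply`, by grouping once on both keys and using named aggregation. This is only a sketch under assumptions: the `Quantity` and `Date` values are made up to match the output above, `pd.offsets.MonthEnd(6)` stands in for the `'6M'` alias (whose spelling has changed across pandas versions), and the intermediate column names `total` and `has_mark` are hypothetical.

```python
import datetime as dt

import numpy as np
import pandas as pd

# Illustrative data: the Quantity and Date values are assumed, not taken
# from the question, and chosen so the results match the output shown above.
df = pd.DataFrame({
    'Branch': 'A A A A A B'.split(),
    'Buyer': 'Carl Mark Carl Joe Joe Carl'.split(),
    'Quantity': [1, 3, 5, 8, 9, 3],
    'Date': [
        dt.datetime(2013, 1, 1, 13, 0),
        dt.datetime(2013, 1, 1, 13, 5),
        dt.datetime(2013, 10, 1, 20, 0),
        dt.datetime(2013, 10, 2, 10, 0),
        dt.datetime(2013, 12, 2, 12, 0),
        dt.datetime(2013, 12, 2, 14, 0),
    ],
})

# One groupby over 6-month-end bins and Branch instead of a nested apply.
grouped = df.set_index('Date').groupby(
    [pd.Grouper(freq=pd.offsets.MonthEnd(6)), 'Branch']
)

# Per group: total quantity, and whether 'Mark' appears among the buyers.
out = grouped.agg(
    total=('Quantity', 'sum'),
    has_mark=('Buyer', lambda s: (s == 'Mark').any()),
)
has_mark = out['has_mark'].astype(bool)

# Same rule as testf: keep the sum if Mark bought, otherwise multiply by 100.
out['buyer'] = np.where(has_mark, 'mark', 'other')
out['quantity'] = np.where(has_mark, out['total'], out['total'] * 100)
out = out[['buyer', 'quantity']]
print(out)
```

The trade-off is that the per-group rule is expressed as column operations after the aggregation rather than inside a Python function, which tends to scale better when there are many groups.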