I have a DataFrame containing a time series:
rng = pd.date_range('2016-06-01', periods=24*7, freq='H')
ones = pd.Series([1]*24*7, rng)
rdf =
Since the question now focuses on grouping by week, you can simply:
rdf.resample('W-{}'.format(rdf.index[-1].strftime('%a')), closed='right', label='right').sum()
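As a self-contained sketch of the same idea: the snippet below rebuilds the example frame (assuming rdf holds a single column 'a' of ones, as the outputs in this thread suggest) and anchors the week with pd.offsets.Week rather than a formatted 'W-...' string, which avoids depending on locale-specific day names from strftime. The variable names anchor and weekly are illustrative only.

```python
import pandas as pd

# Assumed reconstruction of the example data: one week of hourly ones in column 'a'.
rng = pd.date_range('2016-06-01', periods=24 * 7, freq=pd.Timedelta(hours=1))
rdf = pd.DataFrame({'a': [1] * len(rng)}, index=rng)

# Weekly offset anchored on the weekday of the last timestamp (Monday=0).
anchor = pd.offsets.Week(weekday=rdf.index[-1].weekday())

# Right-closed, right-labelled weekly bins ending on that weekday.
weekly = rdf.resample(anchor, closed='right', label='right').sum()
print(weekly)
```

With this data the last timestamp falls on a Tuesday, so the whole week should land in one bin labelled 2016-06-07.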
You can use loffset to get it to work, at least for most periods (using .resample()):
for i in range(2, 7):
    print(i)
    print(rdf.resample('{}D'.format(i), closed='right', loffset='{}D'.format(i)).sum())
2
a
2016-06-01 24
2016-06-03 48
2016-06-05 48
2016-06-07 48
3
a
2016-06-01 24
2016-06-04 72
2016-06-07 72
4
a
2016-06-01 24
2016-06-05 96
2016-06-09 48
5
a
2016-06-01 24
2016-06-06 120
2016-06-11 24
6
a
2016-06-01 24
2016-06-07 144
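Note that loffset was deprecated in pandas 1.1 and removed in 2.0, so the loop above only runs on older versions. The deprecation note in the pandas documentation suggests adding the offset to the result's index after resampling. Here is a minimal sketch of that, again assuming rdf is a single column 'a' of ones. Binning with closed='right' may differ between pandas versions, so the sums are not guaranteed to match the output above.

```python
import pandas as pd

# Assumed reconstruction of the example data: one week of hourly ones in column 'a'.
rng = pd.date_range('2016-06-01', periods=24 * 7, freq=pd.Timedelta(hours=1))
rdf = pd.DataFrame({'a': [1] * len(rng)}, index=rng)

for i in range(2, 7):
    out = rdf.resample('{}D'.format(i), closed='right').sum()
    # Shift the bin labels forward by the bin width, standing in for loffset.
    out.index = out.index + pd.Timedelta(days=i)
    print(i)
    print(out)
```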
However, you could also create custom groupings that calculate the correct values without TimeGrouper, like so:
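days = rdf.index.to_series().dt.day.unique()[::-1]
for n in range(2, 7):
    # Split the days into chunks of n, counting back from the last day.
    chunks = [days[i:i + n] for i in range(0, len(days), n)][::-1]
    # Map each day of the month to the index of its chunk.
    grp = pd.Series({day: idx for idx, chunk in enumerate(chunks) for day in chunk})
    print(n)
    # Naming the key 'groups' is an assumption, made to match the output below.
    print(rdf.groupby(rdf.index.to_series().dt.day.map(grp).rename('groups'))['a'].sum())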
2
groups
0 24
1 48
2 48
3 48
Name: a, dtype: int64
3
groups
0 24
1 72
2 72
Name: a, dtype: int64
4
groups
0 72
1 96
Name: a, dtype: int64
5
groups
0 48
1 120
Name: a, dtype: int64
6
groups
0 24
1 144
Name: a, dtype: int64
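The same end-anchored grouping can also be written with integer arithmetic instead of a lookup table. The sketch below is an alternative, not the code above: it counts whole days back from the last day, floor-divides by n, and labels each block by its last calendar day. sum_from_end is a hypothetical helper name, and rdf is again assumed to be a single column 'a' of ones.

```python
import pandas as pd

# Assumed reconstruction of the example data: one week of hourly ones in column 'a'.
rng = pd.date_range('2016-06-01', periods=24 * 7, freq=pd.Timedelta(hours=1))
rdf = pd.DataFrame({'a': [1] * len(rng)}, index=rng)


def sum_from_end(df, n):
    """Sum column 'a' in n-day blocks counted back from the last day in the index."""
    day = df.index.normalize()              # midnight of each timestamp's day
    days_back = (day.max() - day).days      # 0 for the last day, 1 for the day before, ...
    block = days_back // n                  # block 0 is the most recent n days
    # Label each block by its last calendar day.
    label = day.max() - pd.to_timedelta(block * n, unit='D')
    return df['a'].groupby(label).sum()


for n in range(2, 7):
    print(n)
    print(sum_from_end(rdf, n))
```

Because the blocks are counted from the end, any short block falls at the start of the series, which is the behaviour the custom grouping above is after. Unlike the day-of-month mapping, this version does not assume all timestamps fall within a single month.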