I referred to this post, but my goal is something different.
Example:
ID TIME
01 2018-07-11
01 2018-07-12
01 2018-07-13
01 2018-0
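Use DataFrameGroupBy.diff to get the difference between consecutive TIME values within each ID group, flag the rows where that difference in days is not equal to 1, and build group labels from the cumulative sum of those flags. Finally, pass the labels to GroupBy.cumcount: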
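import pandas as pd
# Parse TIME as datetimes so that day differences can be computed
df['TIME'] = pd.to_datetime(df['TIME'])
# Label runs of consecutive days: a new run starts wherever the per-ID gap is not 1 day
new = df.groupby('ID', group_keys=False)['TIME'].diff().dt.days.ne(1).cumsum()
# Number the rows within each (ID, run) group, starting at 1
df['rank'] = df.groupby(['ID', new]).cumcount().add(1)
print(df)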
   ID       TIME  rank
0   1 2018-07-11     1
1   1 2018-07-12     2
2   1 2018-07-13     3
3   1 2018-07-15     1
4   1 2018-07-16     2
5   1 2018-07-17     3
6   2 2019-09-11     1
7   2 2019-09-12     2
8   2 2019-09-15     1
9   2 2019-09-16     2
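To make the chain easier to follow, here is a minimal self-contained sketch that splits the one-liner into its intermediate steps. The sample data is rebuilt from the printed output above, and the names gap_days, starts_run and run_id are only illustrative, not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the printed output above.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 1, 1, 2, 2, 2, 2],
    "TIME": pd.to_datetime([
        "2018-07-11", "2018-07-12", "2018-07-13",
        "2018-07-15", "2018-07-16", "2018-07-17",
        "2019-09-11", "2019-09-12",
        "2019-09-15", "2019-09-16",
    ]),
})

# Gap in days to the previous row within the same ID
# (NaN for the first row of each ID).
gap_days = df.groupby("ID")["TIME"].diff().dt.days

# True wherever a new run starts: the first row of an ID (NaN != 1)
# or any gap other than exactly one day.
starts_run = gap_days.ne(1)

# The running total of run starts gives every consecutive run its own label.
run_id = starts_run.cumsum()

# Position of each row inside its (ID, run) group, starting at 1.
df["rank"] = df.groupby(["ID", run_id]).cumcount().add(1)
print(df)
```

Because the first row of every ID has no previous value, its gap is NaN, which compares as not equal to 1 and therefore always opens a new run.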
First we flag the rows where the difference from the previous date is not exactly 1 day. Then we group by ID and by the cumsum of those flags, and take the cumulative count within each group:
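# If TIME is not already datetime, convert it first:
# df['TIME'] = pd.to_datetime(df['TIME'])
# True where the gap to the previous row is not exactly 1 day (the first row is treated as a 1-day gap)
s = df['TIME'].diff().fillna(pd.Timedelta(days=1)).ne(pd.Timedelta(days=1))
# Count rows within each ID and each run delimited by those breaks
df['RANK'] = s.groupby([df['ID'], s.cumsum()]).cumcount().add(1)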
   ID       TIME  RANK
0   1 2018-07-11     1
1   1 2018-07-12     2
2   1 2018-07-13     3
3   1 2018-07-15     1
4   1 2018-07-16     2
5   1 2018-07-17     3
6   2 2019-09-11     1
7   2 2019-09-12     2
8   2 2019-09-15     1
9   2 2019-09-16     2
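Unlike the groupby-based diff, this variant computes the difference across the whole column, so it may be worth checking what happens when one ID's dates continue directly into the next ID's. The sketch below uses hypothetical data (not from the question) in which ID 2 starts exactly one day after ID 1 ends. The flag never becomes True, but ID is still part of the grouping key, so the count restarts anyway.

```python
import pandas as pd

# Hypothetical edge case: ID 2 begins one day after ID 1 ends.
df = pd.DataFrame({
    "ID": [1, 1, 2, 2],
    "TIME": pd.to_datetime(
        ["2018-07-11", "2018-07-12", "2018-07-13", "2018-07-14"]
    ),
})

one_day = pd.Timedelta(days=1)

# Every gap is exactly one day, so no break is flagged anywhere,
# not even at the boundary between the two IDs.
s = df["TIME"].diff().fillna(one_day).ne(one_day)

# Grouping by ID as well as by the cumulative break count still
# restarts the numbering at the first row of ID 2.
df["RANK"] = s.groupby([df["ID"], s.cumsum()]).cumcount().add(1)
print(df)
```

This assumes the rows are already sorted by ID and TIME, as they are in the question's example.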