Group by and rank consecutive dates with pandas

萌比男神i 2021-01-25 16:12

I referred to this post, but my goal is somewhat different.

Example

ID    TIME
01    2018-07-11
01    2018-07-12
01    2018-07-13
01    2018-0         


        
2 Answers
  •  故里飘歌
    2021-01-25 16:31

    Use DataFrameGroupBy.diff to get the difference between consecutive TIME values within each ID group, flag the rows where the gap in days is not equal to 1, turn those flags into group labels with a cumulative sum, and finally pass the labels to GroupBy.cumcount:

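    import pandas as pd

    df['TIME'] = pd.to_datetime(df['TIME'])

    # a new run starts wherever the gap to the previous row of the same ID is not 1 day
    new = df.groupby('ID', group_keys=False)['TIME'].diff().dt.days.ne(1).cumsum()
    # number the rows inside each (ID, run) pair, starting at 1
    df['rank'] = df.groupby(['ID',new]).cumcount().add(1)
    print(df)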
       ID       TIME  rank
    0   1 2018-07-11     1
    1   1 2018-07-12     2
    2   1 2018-07-13     3
    3   1 2018-07-15     1
    4   1 2018-07-16     2
    5   1 2018-07-17     3
    6   2 2019-09-11     1
    7   2 2019-09-12     2
    8   2 2019-09-15     1
    9   2 2019-09-16     2
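
For readers who want to run this end to end, here is a minimal sketch that rebuilds the sample frame from the output printed above and splits the one-liner into named steps. The intermediate names `gap_days`, `starts_run` and `run_id` are only illustrative and do not appear in the original answer. The sketch also assumes the rows are already sorted by ID and TIME, since `diff` compares each row with the one directly before it.

```python
import pandas as pd

# Sample data rebuilt from the output printed in the answer above.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 1, 1, 2, 2, 2, 2],
    "TIME": ["2018-07-11", "2018-07-12", "2018-07-13",
             "2018-07-15", "2018-07-16", "2018-07-17",
             "2019-09-11", "2019-09-12", "2019-09-15", "2019-09-16"],
})
df["TIME"] = pd.to_datetime(df["TIME"])

# Day gap to the previous row within each ID (NaN on the first row of each ID).
gap_days = df.groupby("ID")["TIME"].diff().dt.days

# A run starts wherever the gap is not exactly 1 day; NaN also compares as "not equal".
starts_run = gap_days.ne(1)

# The cumulative sum of the start flags gives every run its own label.
run_id = starts_run.cumsum()

# Number the rows inside each (ID, run) pair, starting at 1.
df["rank"] = df.groupby(["ID", run_id]).cumcount().add(1)
print(df)
```

If the data is not sorted, calling `df.sort_values(["ID", "TIME"])` before these steps is probably the simplest fix.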
    
