Rank consecutive dates within each group using pandas

萌比男神i 2021-01-25 16:12

I referred to this post, but my goal is different.

Example

ID    TIME
01    2018-07-11
01    2018-07-12
01    2018-07-13
01    2018-0         


        
2 Answers
  • 2021-01-25 16:31

    Use DataFrameGroupBy.diff to get the difference between consecutive TIME values within each ID group, flag the rows where the gap is not equal to 1 day, turn those flags into run labels with a cumulative sum, and finally pass the labels to GroupBy.cumcount:

    df['TIME'] = pd.to_datetime(df['TIME'])
    
    new = df.groupby('ID', group_keys=False)['TIME'].diff().dt.days.ne(1).cumsum()
    df['rank'] = df.groupby(['ID',new]).cumcount().add(1)
    print (df)
       ID       TIME  rank
    0   1 2018-07-11     1
    1   1 2018-07-12     2
    2   1 2018-07-13     3
    3   1 2018-07-15     1
    4   1 2018-07-16     2
    5   1 2018-07-17     3
    6   2 2019-09-11     1
    7   2 2019-09-12     2
    8   2 2019-09-15     1
    9   2 2019-09-16     2
    
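The steps in the answer above can be sketched one at a time. This is a minimal, self-contained version; the sample frame is reconstructed from the printed output, and the intermediate names gap_days, starts_run and run_id are hypothetical names introduced here for illustration only.

```python
import pandas as pd

# Sample data reconstructed from the output printed in the answer.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 1, 1, 2, 2, 2, 2],
    "TIME": ["2018-07-11", "2018-07-12", "2018-07-13",
             "2018-07-15", "2018-07-16", "2018-07-17",
             "2019-09-11", "2019-09-12", "2019-09-15", "2019-09-16"],
})
df["TIME"] = pd.to_datetime(df["TIME"])

# Step 1: gap in days to the previous row of the same ID
# (NaN for the first row of each ID).
gap_days = df.groupby("ID")["TIME"].diff().dt.days

# Step 2: a row starts a new run when the gap is not exactly 1 day;
# NaN also compares as "not equal", so each ID's first row starts a run.
starts_run = gap_days.ne(1)

# Step 3: the cumulative sum turns the run-start flags into run labels.
run_id = starts_run.cumsum()

# Step 4: number the rows inside each (ID, run) pair, starting at 1.
df["rank"] = df.groupby(["ID", run_id]).cumcount().add(1)
print(df)
```

Because the diff is taken per ID, the first row of every ID always opens a new run, even if its date happens to follow the previous ID's last date by one day.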
  • 2021-01-25 16:38

    First, flag the rows whose difference from the previous date is not exactly 1 day. Then group by ID and the cumulative sum of these flags, and take a cumulative count within each group:

    # df['TIME'] = pd.to_datetime(df['TIME'])
    s = df['TIME'].diff().fillna(pd.Timedelta(days=1)).ne(pd.Timedelta(days=1))
    df['RANK'] = s.groupby([df['ID'], s.cumsum()]).cumcount().add(1)
    
    
       ID       TIME  RANK
    0   1 2018-07-11     1
    1   1 2018-07-12     2
    2   1 2018-07-13     3
    3   1 2018-07-15     1
    4   1 2018-07-16     2
    5   1 2018-07-17     3
    6   2 2019-09-11     1
    7   2 2019-09-12     2
    8   2 2019-09-15     1
    9   2 2019-09-16     2
    
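This second approach takes the diff over the whole column rather than per ID, so it appears to rely on the rows already being ordered by ID and then by TIME. A hedged sketch, assuming the input may arrive unsorted: the sort_values step and the shuffled sample rows below are additions for illustration and are not part of the original answer.

```python
import pandas as pd

# Hypothetical unsorted sample, using a subset of the dates from the thread.
df = pd.DataFrame({
    "ID": [2, 1, 1, 2, 1, 2],
    "TIME": pd.to_datetime([
        "2019-09-12", "2018-07-12", "2018-07-11",
        "2019-09-11", "2018-07-15", "2019-09-15",
    ]),
})

# Assumption: sort first so that consecutive rows are comparable.
df = df.sort_values(["ID", "TIME"]).reset_index(drop=True)

one_day = pd.Timedelta(days=1)

# True where the gap to the previous row is not exactly one day;
# the first row has no previous row and is treated as a 1-day gap.
s = df["TIME"].diff().fillna(one_day).ne(one_day)

# Group by ID and by the running count of breaks, then number the rows.
df["RANK"] = s.groupby([df["ID"], s.cumsum()]).cumcount().add(1)
print(df)
```

Grouping by ID in addition to the cumulative sum is what keeps two different IDs apart when their dates happen to be exactly one day apart at the boundary.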