Groupby first two earliest dates, then average time between first two dates - pandas

闹比i 2021-01-27 17:43

I'm hoping to group by users and find the first two uploads. I've figured out how to get the first date via minimum, but I'm having trouble getting that second upload date. Th

3 answers
  •  一向 (OP)
    2021-01-27 18:26

    Sort by user and date, take the row-to-row difference, then use groupby + nth(1) to pick out the gap between each user's first two uploads, if it exists (users with only one upload date will not show up).

    import pandas as pd
    
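    # Parse the dates, then order each user's uploads chronologically
    df['Date_Uploaded'] = pd.to_datetime(df.Date_Uploaded)
    df = df.sort_values(['User_ID', 'Date_Uploaded'])
    
    # Row-to-row difference; nth(1) keeps each user's second row (second upload minus first)
    df.Date_Uploaded.diff().groupby(df.User_ID).nth(1)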
    
    #User_ID
    #abc123   36 days
    #efg123    7 days
    #Name: Date_Uploaded, dtype: timedelta64[ns]
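
For anyone who wants to run this end to end, here is a minimal sketch on made-up data. The rows, the extra user `xyz789`, and the `gap` column name are all assumptions for illustration; the question's real data is not shown. It takes the difference inside each user group and selects the second row with `cumcount()` instead of `nth(1)`. As far as I know, from pandas 2.0 on `nth` acts as a filter and keeps the original row index, so its result is no longer labeled by User_ID the way the printed output is; the `cumcount()` variant should give a User_ID-indexed result on either version.

```python
import pandas as pd

# Hypothetical sample data (not from the original question).
df = pd.DataFrame({
    "User_ID": ["abc123", "efg123", "abc123", "xyz789", "efg123", "abc123"],
    "Date_Uploaded": ["2021-01-01", "2021-01-10", "2021-02-06",
                      "2021-01-05", "2021-01-17", "2021-03-01"],
})
df["Date_Uploaded"] = pd.to_datetime(df["Date_Uploaded"])
df = df.sort_values(["User_ID", "Date_Uploaded"])

# Time since the previous upload by the same user (NaT on each user's first row).
df["gap"] = df.groupby("User_ID")["Date_Uploaded"].diff()

# cumcount() numbers each user's rows 0, 1, 2, ...; position 1 is the second upload.
is_second = df.groupby("User_ID").cumcount() == 1
first_gap = df[is_second].set_index("User_ID")["gap"]

print(first_gap)         # one row per user that has at least two uploads
print(first_gap.mean())  # average gap between first and second upload
```

`xyz789` has a single upload in this sample, so it drops out of `first_gap`, matching the "users with 1 date will not show up" behaviour described above.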
    

    If you just want the average then average that series:

    df.Date_Uploaded.diff().groupby(df.User_ID).nth(1).mean()
    #Timedelta('21 days 12:00:00')
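
If you also want the two dates themselves rather than only the gap, one possible variant is to keep each user's first two rows and pivot them into columns. This is a sketch on the same kind of made-up data; the sample rows and the names `order`, `first_upload`, `second_upload` and `gap` are assumptions, not part of the question.

```python
import pandas as pd

# Hypothetical sample data (not from the original question).
df = pd.DataFrame({
    "User_ID": ["abc123", "efg123", "abc123", "xyz789", "efg123", "abc123"],
    "Date_Uploaded": pd.to_datetime([
        "2021-01-01", "2021-01-10", "2021-02-06",
        "2021-01-05", "2021-01-17", "2021-03-01",
    ]),
})

# Keep only the two earliest uploads per user.
first_two = df.sort_values(["User_ID", "Date_Uploaded"]).groupby("User_ID").head(2)

# Number them 0 and 1 within each user, then spread them into two columns.
first_two = first_two.assign(order=first_two.groupby("User_ID").cumcount())
wide = first_two.pivot(index="User_ID", columns="order", values="Date_Uploaded")
wide = wide.rename(columns={0: "first_upload", 1: "second_upload"})

# Users with a single upload get NaT here, and mean() skips NaT by default.
wide["gap"] = wide["second_upload"] - wide["first_upload"]
avg = wide["gap"].mean()

print(wide)
print(avg)
```

Unlike the `nth(1)` version, single-upload users stay in the table with a missing second date, which can be handy if you need to count them.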
    
