I'm hoping to groupby users and find the first two uploads. I've figured out how to get the first date via minimum, but I'm having trouble getting that second upload date. Th
Sort, calculate the difference, and then use groupby + nth(1) to get the difference between the first two uploads, if it exists (users with only one date will not show up).
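import pandas as pd
# Parse the dates, then sort so each user's uploads are in chronological order
df['Date_Uploaded'] = pd.to_datetime(df.Date_Uploaded)
df = df.sort_values(['User_ID', 'Date_Uploaded'])
# Row-to-row difference; nth(1) picks each user's second row (second upload minus first)
df.Date_Uploaded.diff().groupby(df.User_ID).nth(1)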
#User_ID
#abc123 36 days
#efg123 7 days
#Name: Date_Uploaded, dtype: timedelta64[ns]
If you just want the average then average that series:
df.Date_Uploaded.diff().groupby(df.User_ID).nth(1).mean()
#Timedelta('21 days 12:00:00')
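For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample rows are made up for illustration (they are not the asker's data), chosen so the gaps match the output above, and the `gap` column name is my own. It also swaps `nth(1)` for a per-user `diff()` plus `cumcount() == 1`, which is an alternative rather than the method above; I'd expect it to give the same result while keeping `User_ID` as the index, since what `nth` returns as the index has differed between pandas versions.

```python
import pandas as pd

# Hypothetical sample data, invented for this example
df = pd.DataFrame({
    "User_ID": ["abc123", "abc123", "abc123", "efg123", "efg123", "xyz789"],
    "Date_Uploaded": ["2019-01-01", "2019-02-06", "2019-03-01",
                      "2019-01-10", "2019-01-17", "2019-05-05"],
})
df["Date_Uploaded"] = pd.to_datetime(df["Date_Uploaded"])
df = df.sort_values(["User_ID", "Date_Uploaded"])

# Gap to the previous upload, computed within each user
# (the first upload of every user gets NaT)
df["gap"] = df.groupby("User_ID")["Date_Uploaded"].diff()

# cumcount() numbers each user's rows 0, 1, 2, ...; row 1 is the second upload,
# so users with a single upload drop out here
second_uploads = df[df.groupby("User_ID").cumcount() == 1]
first_gap = second_uploads.set_index("User_ID")["gap"]

# Average gap between first and second upload across users
average_gap = first_gap.mean()

print(first_gap)
print(average_gap)
```

Here `xyz789` has only one upload, so it does not appear in `first_gap` and does not affect the average.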