How do I add an extra column containing the cumulative time difference for each course? For example, the initial table is:
id_A course we
Use groupby, transform, and .iloc:
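import pandas as pd
# Parse the timestamps so they support datetime arithmetic.
df['ts_A'] = pd.to_datetime(df['ts_A'])
# Seconds elapsed since the first timestamp within each id_A group.
df['cum_delta_sec'] = (df.groupby('id_A')['ts_A']
                         .transform(lambda x: (x - x.iloc[0]).dt.total_seconds()))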
Output:
  id_A       course  weight                ts_A       value  cum_delta_sec
0  id1       cotton     3.5 2017-04-27 01:35:30  150.000000              0
1  id1       cotton     3.5 2017-04-27 01:36:00  416.666667             30
2  id1       cotton     3.5 2017-04-27 01:36:30  700.000000             60
3  id1       cotton     3.5 2017-04-27 01:37:00  950.000000             90
4  id2  cotton blue     5.0 2017-04-27 02:35:30  150.000000              0
5  id2  cotton blue     5.0 2017-04-27 02:36:00  450.000000             30
6  id2  cotton blue     5.0 2017-04-27 02:36:30  520.666667             60
7  id2  cotton blue     5.0 2017-04-27 02:37:00  610.000000             90
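Within each id_A group, subtract the group's first timestamp (x.iloc[0]) from every timestamp, then use the .dt accessor to convert the resulting timedeltas to seconds. This assumes the rows in each group are already sorted by ts_A.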
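
If you want to try this without the original data, here is a minimal self-contained sketch. The sample frame is made up for illustration (only the id_A and ts_A columns, with values mirroring the output above), and the column name cum_delta_alt is hypothetical. It also shows an alternative that takes the question literally: compute the row-to-row differences with diff and accumulate them with cumsum per group. For sorted timestamps, both approaches should give the same numbers.

```python
import pandas as pd

# Hypothetical sample data; only the two columns the calculation needs.
df = pd.DataFrame({
    "id_A": ["id1", "id1", "id1", "id2", "id2", "id2"],
    "ts_A": [
        "2017-04-27 01:35:30", "2017-04-27 01:36:00", "2017-04-27 01:36:30",
        "2017-04-27 02:35:30", "2017-04-27 02:36:00", "2017-04-27 02:36:30",
    ],
})
df["ts_A"] = pd.to_datetime(df["ts_A"])

# Approach from the answer: seconds elapsed since the first row of each group.
df["cum_delta_sec"] = (
    df.groupby("id_A")["ts_A"]
      .transform(lambda x: (x - x.iloc[0]).dt.total_seconds())
)

# Alternative: per-group consecutive differences, then a per-group running sum.
# The first row of each group has no predecessor, so its difference is set to 0.
step_sec = df.groupby("id_A")["ts_A"].diff().dt.total_seconds().fillna(0)
df["cum_delta_alt"] = step_sec.groupby(df["id_A"]).cumsum()

print(df)
```

The transform version is shorter. The diff/cumsum version is handy if you also want to keep the per-step differences as their own column.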