问题
I have a Pandas dataframe with two columns which are speed
and time
.
speed date
54.72 1:33:56
49.37 1:33:59
37.03 1:34:03
24.02 7:39:58
28.02 7:40:01
24.04 7:40:04
24.02 7:40:07
25.35 7:40:10
26.69 7:40:13
32.04 7:40:16
28.02 11:05:43
30.71 11:05:46
29.36 11:05:49
18.68 11:05:52
54.72 11:05:55
34.69 10:31:34
25.03 10:31:38
56.04 10:31:40
44.03 10:31:43
I want to calculate the average of speeds per bins of 30 minutes. For example, the average speed during the 4th bin (1:31 - 2:00) is (54.72 + 49.37 + 37.03)/3. I have thought of converting hours, minutes and seconds to seconds from 00:00 and then have bins of 1800 seconds. I have tried to do use binned_statistic from scipy.stats but my main issue is that I cannot find a way to separate bins based on date and get the average of speeds.
Any ideas?
回答1:
Converting to datetime and using pandas.Grouper
+ Offset Aliases:
df['date'] = pd.to_datetime(df.date)
df.groupby(pd.Grouper(key='date', freq='30min')).mean().dropna()
speed
date
2018-09-20 01:30:00 47.040000
2018-09-20 07:30:00 26.311429
2018-09-20 10:30:00 39.947500
2018-09-20 11:00:00 32.298000
回答2:
Since your date
column isn't really a date, it's probably more sensible to convert it to a timedelta
that way you don't have a date attached to it.
Then, you can use dt.floor
to group into 30 minute bins.
import pandas as pd
df['date'] = pd.to_timedelta(df.date)
df.groupby(df.date.dt.floor('30min')).mean()
Output:
speed
date
01:30:00 47.040000
07:30:00 26.311429
10:30:00 39.947500
11:00:00 32.298000
来源:https://stackoverflow.com/questions/52426972/pandas-bin-dates-into-30-minute-intervals-and-calculate-averages