Question
| Month | day | hour | Temperature |
|-----------|-----|------|-------------|
| September | 01 | 0:00 | 19,11 |
| September | 01 | 1:00 | 18,67 |
| September | 01 | 2:00 | 18,22 |
| September | 01 | 3:00 | 17,77 |
Convert this to:
| Month | day | hour | Temperature |
|-----------|-----|------|-------------|
| September | 01 | 0:00 | T = 19,11 |
| September | 01 | 0:15 | T2 = T + (18,67 - 19,11)/ 4 |
| September | 01 | 0:30 | T3 = T2 + (18,67 - 19,11)/4 |
| September | 01 | 0:45 | T4 = T3 + (18,67 - 19,11)/4 |
| September | 01 | 1:00 | T = 18,67 |
| September | 01 | 1:15 | T2 = T + (18,22 - 18,67)/ 4 |
| September | 01 | 1:30 | T3 = T2 + (18,22 - 18,67)/4 |
| September | 01 | 1:45 | T4 = T3 + (18,22 - 18,67)/4 |
| September | 01 | 2:00 | T = 18,22 |
. . .
I have this data in an Excel file and want to make these changes in Python. I start by loading the dataset into a DataFrame. Can someone help me?
Answer 1:
Here is some sample code:
x = df.Temperature.str.split(",", expand=True)
x:
0 1
0 19 11
1 18 67
2 18 22
3 17 77
y = x[0].astype(int).diff().div(4).fillna(x.iloc[0,0]).astype(float).cumsum()
y:
0 19.00
1 18.75
2 18.75
3 18.50
Name: 0, dtype: float64
Do the same for the other column, then merge the two results to get "<num1>, <num2>".
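As a side note, if the comma in `19,11` is a decimal separator (an assumption; the question does not state it explicitly), the value can be parsed as one float instead of two integer parts. The following minimal sketch checks the question's rule `T2 = T + (next - T)/4` for the first hour only. The names `temps`, `values`, `step`, and `first_hour` are made up for this illustration.

```python
import pandas as pd

# Sample values from the question, stored as decimal-comma strings.
temps = pd.Series(["19,11", "18,67", "18,22", "17,77"])

# Assumption: "19,11" means 19.11, so swap the comma for a dot and cast.
values = temps.str.replace(",", ".", regex=False).astype(float)

# Question's rule: each quarter-hour adds a quarter of the hourly difference.
step = (values[1] - values[0]) / 4
first_hour = [values[0] + k * step for k in range(4)]

# Temperatures at 0:00, 0:15, 0:30 and 0:45.
print([round(float(t), 2) for t in first_hour])
```

Working on a single float column avoids having to recombine the two parts afterwards.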
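1st stage: resample
import pandas as pd

# Split "19,11" into its two parts and store them as integers.
df[['temp1', 'temp2']] = df.Temperature.str.split(",", expand=True)
df['temp1'] = df['temp1'].astype(int)
df['temp2'] = df['temp2'].astype(int)
# Parse the hour strings and use them as a DatetimeIndex.
u = pd.to_datetime(df['hour'], format='%H:%M')
df['hr'] = u.dt.hour
df = df.set_index(u)
# Upsample to 15-minute (900 s) steps; ffill() replaces the removed pad().
df1 = df.resample('900s').ffill()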
df1:
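To check what this stage produces, a self-contained sketch on the four sample rows from the question might look like the following. The DataFrame construction is an assumption about how the data looks after loading from Excel, and `ffill()` is used as the current spelling of `pad()`.

```python
import pandas as pd

# Hypothetical stand-in for the loaded Excel sheet (sample rows from the question).
df = pd.DataFrame({
    "Month": ["September"] * 4,
    "day": ["01"] * 4,
    "hour": ["0:00", "1:00", "2:00", "3:00"],
    "Temperature": ["19,11", "18,67", "18,22", "17,77"],
})

# Index by the parsed time of day, then upsample to 15-minute steps.
df = df.set_index(pd.to_datetime(df["hour"], format="%H:%M"))
df1 = df.resample("900s").ffill()

# Each hourly row is repeated for the three quarter-hours that follow it.
print(len(df1))
print(df1["Temperature"].head(5).tolist())
```

At this point the quarter-hour rows only repeat the hourly values; the actual distribution of the differences still has to happen in a later step.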
2nd Stage
<to be continued>
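Edit 2:
import pandas as pd

# Index by time of day and upsample to 15-minute steps, back-filling each gap
# with the next hour's row.
df['hour'] = pd.to_datetime(df['hour'], format='%H:%M')
df.set_index('hour', inplace=True)
v = df.resample('15min').bfill().reset_index()
# Split "19,11" into its two parts and store them as integers.
v[['temp1', 'temp2']] = v.Temperature.str.split(",", expand=True)
v['temp1'] = v['temp1'].astype(int)
v['temp2'] = v['temp2'].astype(int)
# Group the rows by hour of day.
t = v.groupby(v['hour'].dt.hour)

def calc(val1, val2):
    # val1 is one hour's group of rows, val2 is the first row of that group.
    diff1 = (val1['temp1'] - val2['temp1'])
    diff1.iloc[0] = val1['temp1'].iloc[0] * 4
    diff2 = (val1['temp2'] - val2['temp2'])
    diff2.iloc[0] = val1['temp2'].iloc[0] * 4
    # Dividing by 4 and accumulating starts at the hour's value and adds a
    # quarter of the difference at every step.
    t1_group = diff1.div(4).cumsum()
    t2_group = diff2.div(4).cumsum()
    return list(zip(t1_group, t2_group))

concat_res = []
for _, gr in t:
    concat_res.append(calc(gr, gr.iloc[0]))
# Flatten the per-hour lists into one list with an entry per row.
v['Temperature'] = [item for sublist in concat_res for item in sublist]
v = v.drop(['temp1', 'temp2'], axis=1)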
v:
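For comparison, here is a shorter sketch that is not the approach taken in this answer: it swaps in pandas' built-in linear interpolation, which yields the same quarter-of-the-difference steps the question describes. It assumes that `19,11` is a decimal-comma number (19.11) and that the sheet has the four columns shown in the question. The sample DataFrame and the name `out` are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the loaded Excel sheet (sample rows from the question).
df = pd.DataFrame({
    "Month": ["September"] * 4,
    "day": ["01"] * 4,
    "hour": ["0:00", "1:00", "2:00", "3:00"],
    "Temperature": ["19,11", "18,67", "18,22", "17,77"],
})

# Assumption: the comma is a decimal separator, so "19,11" becomes 19.11.
df["Temperature"] = df["Temperature"].str.replace(",", ".", regex=False).astype(float)

# Use the time of day as a DatetimeIndex and insert empty 15-minute rows.
df["hour"] = pd.to_datetime(df["hour"], format="%H:%M")
out = df.set_index("hour").resample("15min").asfreq()

# Fill the new rows: a straight line between hourly readings, labels carried forward.
out["Temperature"] = out["Temperature"].interpolate(method="linear")
out[["Month", "day"]] = out[["Month", "day"]].ffill()

# Turn the index back into an "H:MM" text column like the original table.
out = out.reset_index()
out["hour"] = out["hour"].map(lambda ts: f"{ts.hour}:{ts.minute:02d}")

print(out.head(5))
```

Because the resampled rows are evenly spaced, `method="linear"` adds exactly one quarter of each hourly difference per step. If the data spans several days, the index would need the full date as well, which this sketch leaves out.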
Source: https://stackoverflow.com/questions/65395578/how-to-divide-hours-in-15m-intervals-and-distribute-the-values-of-each-hour-in