Question
I can't seem to find the answer to my question so I'm trying my luck on here. Would very much appreciate your help.
I've got a Pandas dataframe with values in Col1 and Col2. Instead of the np.nan values in Col2, I'd like to calculate the following: today's Col2 value = previous day's Col2 value multiplied by today's Col1 value.
This should be some form of recursive calculation. I've tried several approaches, including the for loop below, but none seem to work:
df = pd.read_excel('/Users/fhggshgf/Desktop/test.xlsx')
df.index = df.date
df.drop(['date'], axis=1, inplace=True)
for i in range(1, len(df)):
    fill_value = df['Col2'].iloc[i - 1]
    finaldf['Col2'].fillna(fill_value, inplace=True)
Answer 1:
You could try something like this.
import pandas as pd
import numpy as np
df = pd.DataFrame({'date': [1,2,3,4,5,6],
                   'col_1': [951, 909, 867, 844, 824, 826],
                   'col_2': [179, 170, 164, 159, 153, 149]})
col_2_update_list = []
# Relies on the default integer index (0, 1, 2, ...), so i - 1 is the previous row.
for i, row in df.iterrows():
    if i != 0:
        today_col_1 = df.at[i, 'col_1']
        prev_day_col_2 = df.at[i - 1, 'col_2']
        new_col_2_val = prev_day_col_2 * today_col_1
        col_2_update_list.append(new_col_2_val)
    else:
        # The first row has no previous day, so there is nothing to compute.
        col_2_update_list.append(np.nan)
df['updated_col_2'] = col_2_update_list
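Note that the loop above always multiplies by the original col_2 value of the previous row, so a run of consecutive NaN values would not chain from one computed value to the next. A minimal sketch of a loop that carries the computed value forward is shown here. It assumes that the first col_2 value is known and that existing non-NaN values should be kept. The helper name fill_recursive and the sample numbers are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up sample data (not from the question) chosen so the result is easy to check.
df = pd.DataFrame({'date': [1, 2, 3, 4, 5, 6],
                   'col_1': [2.0, 3.0, 0.5, 4.0, 2.0, 1.5],
                   'col_2': [10.0, np.nan, np.nan, 7.0, np.nan, np.nan]})


def fill_recursive(col_1, col_2):
    """Fill each NaN in col_2 with (previous col_2, possibly just filled) * today's col_1."""
    out = col_2.to_numpy(dtype=float, copy=True)
    factors = col_1.to_numpy(dtype=float)
    for i in range(1, len(out)):
        if np.isnan(out[i]):
            # out[i - 1] already holds the filled value when the previous row was NaN too.
            out[i] = out[i - 1] * factors[i]
    return pd.Series(out, index=col_2.index, name=col_2.name)


df['col_2'] = fill_recursive(df['col_1'], df['col_2'])
print(df)
```

Working on plain NumPy arrays inside the loop avoids the per-row overhead of iterrows, which tends to matter once the frame has more than a few thousand rows.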
Answer 2:
This avoids loops, but you need to create two temporary columns:
import pandas as pd
import numpy as np
import sys
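df = pd.DataFrame({'date': [1,2,3,4,5,6],
                   'col_1': [951, 909, 867, 844, 824, 826],
                   'col_2': [179, np.nan, 164, 159, np.nan, 149]})
print(df)
# Candidate value for every row: last known col_2 times today's col_1
df['col_4'] = df['col_2'].ffill() * df['col_1']
# Known col_2 values are kept; missing ones become a huge sentinel
df['col_3'] = df['col_2'].fillna(sys.maxsize)
# The row-wise minimum picks the known value where there is one, else the candidate
# (this works here because every col_1 value is greater than 1)
df['col_2'] = df[['col_4', 'col_3']].min(axis=1).astype(int)
df = df.drop(['col_4', 'col_3'], axis=1)
print(df)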
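One caveat with the column trick above: the forward fill repeats the last known col_2 value, so with two or more NaN values in a row the second one is not computed from the first filled value. A loop-free sketch that does chain the multiplication is given here, using a cumulative product inside each block of rows that starts at a known value. It assumes the first col_2 value is known (rows before the first known value stay NaN), and the sample numbers are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up sample data (not from the question) with consecutive NaN values.
df = pd.DataFrame({'date': [1, 2, 3, 4, 5, 6],
                   'col_1': [2.0, 3.0, 0.5, 4.0, 2.0, 1.5],
                   'col_2': [10.0, np.nan, np.nan, 7.0, np.nan, np.nan]})

known = df['col_2'].notna()
# Each known value starts a new block; the NaN rows after it belong to that block.
block = known.cumsum()
# Known rows get a factor of 1 so that they keep their own value.
factors = df['col_1'].where(~known, 1.0)
# Running product of the factors inside each block, times the block's starting value.
df['col_2'] = df['col_2'].ffill() * factors.groupby(block).cumprod()
print(df)
```

Unlike the minimum-based comparison, this does not depend on col_1 being greater than 1, since known rows are identified with notna rather than by comparing magnitudes.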
Source: https://stackoverflow.com/questions/59022908/define-recursive-function-in-pandas-dataframe