+= operation with non-existent DataFrames

前端未结


df_pairs:

  city1 city2
0   sfo   yyz
1   sfo   yvr
2   sfo   dfw
3   sfo   ewr

Output of df_pairs.to_dict('records'):
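For reference, a minimal sketch that rebuilds df_pairs from the four rows in the table above and prints its records. Nothing beyond those rows is assumed.

```python
import pandas as pd

# Rebuild df_pairs from the table shown in the question.
df_pairs = pd.DataFrame({
    "city1": ["sfo", "sfo", "sfo", "sfo"],
    "city2": ["yyz", "yvr", "dfw", "ewr"],
})

# to_dict("records") returns one dict per row, keyed by column name.
records = df_pairs.to_dict("records")
print(records)
```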


2 Answers

再見小時候

2021-01-26 07:40

Give this a run:

Take out the initial variables, and get rid of the for loop.

import numpy as np
import pandas as pd

# Look up the data for each side of the pair, indexed by (city1, city2)
a = pd.merge(df_pairs, data_df, left_on='city1', right_on='city', how='left').set_index(['city1', 'city2'])
b = pd.merge(df_pairs, data_df, left_on='city2', right_on='city', how='left').set_index(['city1', 'city2'])
del a['city']
del b['city']

Now do each calculation once and sum across each row (axis=1)

diff_df = b - a
diff_df_sign = np.sign(diff_df)
diff_df_sign_pos = diff_df_sign.clip(lower=0).sum(axis=1)
diff_df_sign_neg = diff_df_sign.clip(upper=0).sum(axis=1)
diff_df_pos = diff_df.clip(lower=0).sum(axis=1)
diff_df_neg = diff_df.clip(upper=0).sum(axis=1)

Does this look like the output you want?

city1  city2
sfo    yyz      5
       yvr      5
       dfw      5
       ewr      4
dtype: float64

city1  city2
sfo    yyz      0
       yvr      0
       dfw      0
       ewr     -1
dtype: float64

city1  city2
sfo    yyz      45.83
       yvr      45.83
       dfw      75.38
       ewr      19.55
dtype: float64

city1  city2
sfo    yyz      0.0
       yvr      0.0
       dfw      0.0
       ewr     -1.1
dtype: float64
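To try the approach end to end, here is a minimal sketch. The question's data_df is not shown, so the data_df below is hypothetical: a 'city' column plus two invented numeric columns, m1 and m2. Only df_pairs comes from the question, and the numbers will not match the output above.

```python
import numpy as np
import pandas as pd

df_pairs = pd.DataFrame({
    "city1": ["sfo", "sfo", "sfo", "sfo"],
    "city2": ["yyz", "yvr", "dfw", "ewr"],
})

# Hypothetical stand-in for data_df: one row per city, made-up metrics.
data_df = pd.DataFrame({
    "city": ["sfo", "yyz", "yvr", "dfw", "ewr"],
    "m1": [1.0, 3.0, 1.0, 0.5, 2.0],
    "m2": [10.0, 8.0, 12.5, 10.0, 11.0],
})

# One merge per side of the pair, both indexed by (city1, city2).
a = pd.merge(df_pairs, data_df, left_on="city1", right_on="city",
             how="left").set_index(["city1", "city2"])
b = pd.merge(df_pairs, data_df, left_on="city2", right_on="city",
             how="left").set_index(["city1", "city2"])
del a["city"]
del b["city"]

diff_df = b - a                  # city2 value minus city1 value, per column
diff_df_sign = np.sign(diff_df)  # -1, 0 or 1 per cell

# clip(lower=0) zeroes the negatives, clip(upper=0) zeroes the positives.
diff_df_sign_pos = diff_df_sign.clip(lower=0).sum(axis=1)  # count of increases
diff_df_sign_neg = diff_df_sign.clip(upper=0).sum(axis=1)  # minus count of decreases
diff_df_pos = diff_df.clip(lower=0).sum(axis=1)            # total increase
diff_df_neg = diff_df.clip(upper=0).sum(axis=1)            # total decrease

print(diff_df_pos)
print(diff_df_neg)
```

Because a and b share the same (city1, city2) index, b - a lines up row by row without any loop or accumulator variables.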


花落未央

2021-01-26 07:48

Why don't you simply do this:

# Drop the string key columns after each merge so the subtraction only sees numeric data
df_city1 = pd.merge(df_pairs['city1'], data_df, left_on='city1', right_on='city', how='left').drop(columns=['city1', 'city'])
df_city2 = pd.merge(df_pairs['city2'], data_df, left_on='city2', right_on='city', how='left').drop(columns=['city2', 'city'])
diff = df_city2.subtract(df_city1, fill_value=0)
pos_sum = diff[diff >= 0].sum(axis=1)
neg_sum = diff[diff < 0].sum(axis=1)

That avoids looping over all your columns, merging 2*(number of columns) times, the extra indexing, and that complicated bit with np.sign and .clip. Your df_pairs and data_df have a one-to-one correspondence, right?
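A minimal sketch of this variant on invented data. The data_df here is hypothetical (a 'city' column plus made-up numeric columns m1 and m2). It also assumes the string key columns have to be dropped before subtracting, since subtracting string columns raises a TypeError. The final check suggests that boolean masking gives the same row sums as clipping.

```python
import pandas as pd

df_pairs = pd.DataFrame({
    "city1": ["sfo", "sfo", "sfo", "sfo"],
    "city2": ["yyz", "yvr", "dfw", "ewr"],
})

# Hypothetical stand-in for data_df: one row per city, made-up metrics.
data_df = pd.DataFrame({
    "city": ["sfo", "yyz", "yvr", "dfw", "ewr"],
    "m1": [1.0, 3.0, 1.0, 0.5, 2.0],
    "m2": [10.0, 8.0, 12.5, 10.0, 11.0],
})

# Merge each side separately, keeping only the numeric columns.
df_city1 = pd.merge(df_pairs["city1"], data_df, left_on="city1",
                    right_on="city", how="left").drop(columns=["city1", "city"])
df_city2 = pd.merge(df_pairs["city2"], data_df, left_on="city2",
                    right_on="city", how="left").drop(columns=["city2", "city"])

# Rows align by position because both merges preserve df_pairs' row order.
diff = df_city2.subtract(df_city1, fill_value=0)

# Masking replaces non-matching cells with NaN, which sum() skips.
pos_sum = diff[diff >= 0].sum(axis=1)
neg_sum = diff[diff < 0].sum(axis=1)

# Same totals as the clip-based version.
assert pos_sum.equals(diff.clip(lower=0).sum(axis=1))
assert neg_sum.equals(diff.clip(upper=0).sum(axis=1))
print(pos_sum)
print(neg_sum)
```

The positional alignment relies on every city in df_pairs matching exactly one row of data_df. If a city were duplicated in data_df, the left merge would add rows and the two frames would no longer line up.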
