+= operation with non-existent dataframes

我寻月下人不归 2021-01-26 07:27

df_pairs:

  city1 city2
0   sfo   yyz
1   sfo   yvr
2   sfo   dfw
3   sfo   ewr

output of df_pairs.to_dict('records'):

2 Answers
  • 2021-01-26 07:40

    Give this a run:

    Take out the initial variables, and get rid of the for loop.

    a = pd.merge(df_pairs, data_df, left_on='city1', right_on='city', how='left').set_index(['city1', 'city2'])
    b = pd.merge(df_pairs, data_df, left_on='city2', right_on='city', how='left').set_index(['city1', 'city2'])
    del a['city']
    del b['city']
    

    Now do each calculation once and sum across each row (axis=1):

    diff_df = b - a
    diff_df_sign = np.sign(diff_df)
    diff_df_sign_pos = diff_df_sign.clip(lower=0).sum(axis=1)
    diff_df_sign_neg = diff_df_sign.clip(upper=0).sum(axis=1)
    diff_df_pos = diff_df.clip(lower=0).sum(axis=1)
    diff_df_neg = diff_df.clip(upper=0).sum(axis=1)
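
    As a rough illustration of what the sign/clip lines compute, here is a sketch on a small hand-made diff_df. The columns m1 to m3 and their values are invented, not taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical per-pair differences; column names m1..m3 are made up.
diff_df = pd.DataFrame(
    {"m1": [2.0, -0.5], "m2": [0.0, 2.0], "m3": [1.5, -1.0]},
    index=pd.MultiIndex.from_tuples(
        [("sfo", "yyz"), ("sfo", "ewr")], names=["city1", "city2"]
    ),
)

sign = np.sign(diff_df)                      # -1, 0 or +1 per cell
n_pos = sign.clip(lower=0).sum(axis=1)       # number of columns where city2 > city1
n_neg = sign.clip(upper=0).sum(axis=1)       # minus the number where city2 < city1
sum_pos = diff_df.clip(lower=0).sum(axis=1)  # total of the positive differences
sum_neg = diff_df.clip(upper=0).sum(axis=1)  # total of the negative differences

print(pd.DataFrame({"n_pos": n_pos, "n_neg": n_neg, "sum_pos": sum_pos, "sum_neg": sum_neg}))
```

    Clipping the signs at 0 turns them into per-cell indicators, so the row sum is a count; clipping the raw differences keeps the magnitudes, so the row sum is a total. NaN cells are skipped by sum by default.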
    

    Does this look like the output you want?

    city1  city2
    sfo    yyz      5
           yvr      5
           dfw      5
           ewr      4
    dtype: float64
    
    city1  city2
    sfo    yyz      0
           yvr      0
           dfw      0
           ewr     -1
    dtype: float64
    
    city1  city2
    sfo    yyz      45.83
           yvr      45.83
           dfw      75.38
           ewr      19.55
    dtype: float64
    
    city1  city2
    sfo    yyz      0.0
           yvr      0.0
           dfw      0.0
           ewr     -1.1
    dtype: float64
    
  • 2021-01-26 07:48

    Why don't you simply do this:

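    # Drop the key columns after each merge so that only numeric columns are subtracted
    df_city1 = pd.merge(df_pairs['city1'], data_df, left_on='city1', right_on='city', how='left').drop(columns=['city1', 'city'])
    df_city2 = pd.merge(df_pairs['city2'], data_df, left_on='city2', right_on='city', how='left').drop(columns=['city2', 'city'])
    diff = df_city2.subtract(df_city1, fill_value=0)
    # Cells that fail the mask become NaN, which sum() skips
    pos_sum = diff[diff >= 0].sum(axis=1)
    neg_sum = diff[diff <  0].sum(axis=1)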
    

    That replaces looping over all your columns, merging 2*(number of columns) times, the extra indexing, and that complicated bit with np.sign and .clip. Your df_pairs and data_df have a one-to-one correspondence, right?
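
    A minimal sketch of the masked-sum idea, under the assumption that each city appears at most once in data_df. The data, the columns m1 and m2, and the helper metrics_for are hypothetical. The validate argument of pd.merge is used here only to make that assumption explicit; it is not part of the original answer.

```python
import pandas as pd

# Hypothetical data: the real data_df is not shown in the question.
df_pairs = pd.DataFrame({"city1": ["sfo", "sfo"], "city2": ["yyz", "ewr"]})
data_df = pd.DataFrame(
    {"city": ["sfo", "yyz", "ewr"], "m1": [1.0, 3.0, 0.5], "m2": [2.0, 2.0, 4.0]}
)

def metrics_for(key):
    """Numeric columns of data_df for each city in df_pairs[key], in pair order."""
    merged = pd.merge(
        df_pairs[[key]], data_df, left_on=key, right_on="city",
        how="left", validate="many_to_one",  # raises MergeError if a city repeats in data_df
    )
    return merged.drop(columns=[key, "city"])

diff = metrics_for("city2") - metrics_for("city1")
pos_sum = diff[diff >= 0].sum(axis=1)  # cells failing the mask become NaN and are skipped
neg_sum = diff[diff < 0].sum(axis=1)

# The boolean masks give the same totals as clip().
assert pos_sum.equals(diff.clip(lower=0).sum(axis=1))
assert neg_sum.equals(diff.clip(upper=0).sum(axis=1))
print(pd.concat([df_pairs, pos_sum.rename("pos_sum"), neg_sum.rename("neg_sum")], axis=1))
```

    If data_df listed a city twice, the left merge would otherwise silently duplicate pairs and the two frames would no longer line up row by row, which is why the check is worth having.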
