I want to multiply two columns in a pandas DataFrame and add the result into a new column

前端 未结 7 1325
清酒与你
清酒与你 2020-12-02 09:49

I\'m trying to multiply two existing columns in a pandas Dataframe (orders_df) - Prices (stock close price) and Amount (stock quantities) and add the calculation to a new co

相关标签:
7条回答
  • Good solution from bmu. I think it's more readable to put the values inside the parentheses vs outside.

        df['Values'] = np.where(df.Action == 'Sell', 
                                df.Prices*df.Amount, 
                               -df.Prices*df.Amount)
    

    Using some pandas built in functions.

        df['Values'] = np.where(df.Action.eq('Sell'), 
                                df.Prices.mul(df.Amount), 
                               -df.Prices.mul(df.Amount))
    
    0 讨论(0)
  • 2020-12-02 10:09

    To make things neat, I take Hayden's solution but make a small function out of it.

    def create_value(row):
        if row['Action'] == 'Sell':
            return row['Prices'] * row['Amount']
        else:
            return -row['Prices']*row['Amount']
    

    so that when we want to apply the function to our dataframe, we can do..

    df['Value'] = df.apply(lambda row: create_value(row), axis=1)
    

    ...and any modifications only need to occur in the small function itself.

    Concise, Readable, and Neat!

    0 讨论(0)
  • 2020-12-02 10:21

    I think an elegant solution is to use the where method (also see the API docs):

    In [37]: values = df.Prices * df.Amount
    
    In [38]: df['Values'] = values.where(df.Action == 'Sell', other=-values)
    
    In [39]: df
    Out[39]: 
       Prices  Amount Action  Values
    0       3      57   Sell     171
    1      89      42   Sell    3738
    2      45      70    Buy   -3150
    3       6      43   Sell     258
    4      60      47   Sell    2820
    5      19      16    Buy    -304
    6      56      89   Sell    4984
    7       3      28    Buy     -84
    8      56      69   Sell    3864
    9      90      49    Buy   -4410
    

    Further more this should be the fastest solution.

    0 讨论(0)
  • 2020-12-02 10:26

    You can use the DataFrame apply method:

    order_df['Value'] = order_df.apply(lambda row: (row['Prices']*row['Amount']
                                                   if row['Action']=='Sell'
                                                   else -row['Prices']*row['Amount']),
                                       axis=1)
    

    It is usually faster to use these methods rather than over for loops.

    0 讨论(0)
  • 2020-12-02 10:27

    Since this question came up again, I think a good clean approach is using assign.

    The code is quite expressive and self-describing:

    df = df.assign(Value = lambda x: x.Prices * x.Amount * x.Action.replace({'Buy' : 1, 'Sell' : -1}))
    
    0 讨论(0)
  • 2020-12-02 10:28

    If we're willing to sacrifice the succinctness of Hayden's solution, one could also do something like this:

    In [22]: orders_df['C'] = orders_df.Action.apply(
                   lambda x: (1 if x == 'Sell' else -1))
    
    In [23]: orders_df   # New column C represents the sign of the transaction
    Out[23]:
       Prices  Amount Action  C
    0       3      57   Sell  1
    1      89      42   Sell  1
    2      45      70    Buy -1
    3       6      43   Sell  1
    4      60      47   Sell  1
    5      19      16    Buy -1
    6      56      89   Sell  1
    7       3      28    Buy -1
    8      56      69   Sell  1
    9      90      49    Buy -1
    

    Now we have eliminated the need for the if statement. Using DataFrame.apply(), we also do away with the for loop. As Hayden noted, vectorized operations are always faster.

    In [24]: orders_df['Value'] = orders_df.Prices * orders_df.Amount * orders_df.C
    
    In [25]: orders_df   # The resulting dataframe
    Out[25]:
       Prices  Amount Action  C  Value
    0       3      57   Sell  1    171
    1      89      42   Sell  1   3738
    2      45      70    Buy -1  -3150
    3       6      43   Sell  1    258
    4      60      47   Sell  1   2820
    5      19      16    Buy -1   -304
    6      56      89   Sell  1   4984
    7       3      28    Buy -1    -84
    8      56      69   Sell  1   3864
    9      90      49    Buy -1  -4410
    

    This solution takes two lines of code instead of one, but is a bit easier to read. I suspect that the computational costs are similar as well.

    0 讨论(0)
提交回复
热议问题