Does pandas iterrows have performance issues?

名媛妹妹 2020-11-21 07:04

I have noticed very poor performance when using iterrows from pandas.

Is this something that is experienced by others? Is it specific to iterrows and should this fun

6 Answers
  •  感动是毒
    2020-11-21 07:44

    Here's a way to solve your problem without iterrows. Every step below is vectorized.

    In [58]: df = table1.merge(table2,on='letter')
    
    In [59]: df['calc'] = df['number1']*df['number2']
    
    In [60]: df
    Out[60]: 
      letter  number1  number2  calc
    0      a       50      0.2    10
    1      a       50      0.5    25
    2      b      -10      0.1    -1
    3      b      -10      0.4    -4
    
    In [61]: df.groupby('letter')['calc'].max()
    Out[61]: 
    letter
    a         25
    b         -1
    Name: calc, dtype: float64
    
    In [62]: df.groupby('letter')['calc'].idxmax()
    Out[62]: 
    letter
    a         1
    b         2
    Name: calc, dtype: int64
    
    In [63]: df.loc[df.groupby('letter')['calc'].idxmax()]
    Out[63]: 
      letter  number1  number2  calc
    1      a       50      0.5    25
    2      b      -10      0.1    -1
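
For anyone who wants to run the session above outside IPython, here is a minimal self-contained sketch. The contents of `table1` and `table2` are not shown in the answer, so the two frames below are assumptions reconstructed from the merged output. The `iterrows` loop at the end is also an addition, included only to show the row-by-row equivalent that the vectorized steps replace.

```python
import pandas as pd

# Assumed inputs: reconstructed from the merged frame printed in the answer,
# not taken from the original question.
table1 = pd.DataFrame({"letter": ["a", "b"], "number1": [50, -10]})
table2 = pd.DataFrame({
    "letter": ["a", "a", "b", "b"],
    "number2": [0.2, 0.5, 0.1, 0.4],
})

# Vectorized path: one merge, one column-wise multiply, one grouped idxmax.
df = table1.merge(table2, on="letter")
df["calc"] = df["number1"] * df["number2"]
best = df.loc[df.groupby("letter")["calc"].idxmax()]

# Row-by-row path with iterrows, for comparison only.
# It tracks the index label of the largest calc seen so far per letter.
best_loop = {}
for idx, row in df.iterrows():
    letter = row["letter"]
    if letter not in best_loop or row["calc"] > df.at[best_loop[letter], "calc"]:
        best_loop[letter] = idx

print(best)
print(sorted(best_loop.values()) == sorted(best.index))
```

Both paths select the same rows. The vectorized one hands the per-row work to pandas instead of building a Series object for every row, which is the usual reason it scales better on large frames.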
    
