pandas apply function to multiple columns and multiple rows

一整个雨季 2021-02-15 05:49

I have a dataframe with consecutive pixel coordinates in rows and columns 'xpos' and 'ypos', and I want to calculate the angle in degrees of each path between consecutive pixels.

1 Answer
  • 2021-02-15 06:05

    You can vectorise this with pandas and NumPy. I compared this approach with your loop on a 10,000-row dataframe: it is over 1000 times faster (1.27 ms versus 1.29 s), and the loop timing does not even include adding the list back as a new column.

    In [108]:
    
    %%timeit
    import numpy as np
    # slope = dy/dx against the previous row, as in the loop; the parentheses are required
    df['angle'] = np.abs(180/np.pi * np.arctan((df['ypos'].shift() - df['ypos'])/(df['xpos'].shift() - df['xpos'])))
    
    1000 loops, best of 3: 1.27 ms per loop
    
    In [100]:
    
    %%timeit
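    import math
    import numpy as np
    temp_list=[]
    for count, row in df.iterrows():
        x1 = row['xpos']
        y1 = row['ypos']
        try:
            # .ix was removed in pandas 1.0; .loc raises KeyError for the missing label -1
            x2 = df['xpos'].loc[count-1]
            y2 = df['ypos'].loc[count-1]
            a = abs(180/math.pi * math.atan((y2-y1)/(x2-x1)))
            temp_list.append(a)
        except KeyError:
            # first row has no predecessor
            temp_list.append(np.nan)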
    1 loops, best of 3: 1.29 s per loop
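
As a sanity check that the vectorised expression and the row-by-row loop agree, here is a minimal sketch on a made-up four-point path. The sample values and the names `vectorised` and `looped` are illustrative assumptions, not part of the original benchmark, and the frame is assumed to have the default integer index.

```python
import math

import numpy as np
import pandas as pd

# Hypothetical sample path; only the column names come from the question.
df = pd.DataFrame({"xpos": [0.0, 1.0, 2.0, 4.0], "ypos": [0.0, 1.0, 1.0, 3.0]})

# Vectorised: differences against the previous row, then arctan of dy/dx in degrees.
dx = df["xpos"].shift() - df["xpos"]
dy = df["ypos"].shift() - df["ypos"]
vectorised = np.abs(np.degrees(np.arctan(dy / dx)))

# Row-by-row reference; the first row has no predecessor, so it gets NaN.
looped = [np.nan]
for i in range(1, len(df)):
    x1, y1 = df["xpos"].iloc[i], df["ypos"].iloc[i]
    x2, y2 = df["xpos"].iloc[i - 1], df["ypos"].iloc[i - 1]
    looped.append(abs(math.degrees(math.atan((y2 - y1) / (x2 - x1)))))

# Compare everything except the leading NaN.
print(np.allclose(vectorised.iloc[1:], looped[1:]))
```

The first element is NaN in both versions, so the comparison skips it.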
    

    Also, avoid apply where possible, since it operates row-wise. If you can find a vectorised method that works on the entire Series or DataFrame, always prefer it.
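
To illustrate the difference, the sketch below computes the same quantity twice, once with a row-wise `apply` and once with a single NumPy call over whole columns. The distance-from-origin calculation and the sample values are assumptions chosen only to keep the example small.

```python
import math

import numpy as np
import pandas as pd

# Hypothetical coordinates; only the column names come from the question.
df = pd.DataFrame({"xpos": [0.0, 3.0, 6.0], "ypos": [0.0, 4.0, 8.0]})

# Row-wise: apply with axis=1 calls the Python function once per row.
dist_apply = df.apply(lambda row: math.hypot(row["xpos"], row["ypos"]), axis=1)

# Vectorised: one NumPy call operating on the two columns as arrays.
dist_vec = np.hypot(df["xpos"], df["ypos"])

print(np.allclose(dist_apply, dist_vec))
```

Both give the same values; the vectorised form avoids the per-row Python call, which is where the cost comes from on large frames.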

    UPDATE

    Since you are just subtracting the previous row, there is a built-in method for this, diff, which results in even faster code:

    In [117]:
    
    %%timeit
    import numpy as np
    # diff(1) is current row minus previous row; slope = dy/dx as in the loop
    df['angle'] = np.abs(180/np.pi * np.arctan(df['ypos'].diff(1)/df['xpos'].diff(1)))
    
    1000 loops, best of 3: 1.01 ms per loop
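
A short sketch of how `diff` relates to the `shift` form, using made-up values. One detail worth noting: `diff(1)` is current minus previous, while `shift() - s` is previous minus current, so the two differ in sign. That does not affect the angle here, because the sign cancels in the ratio of two differences.

```python
import pandas as pd

# Hypothetical series standing in for one coordinate column.
s = pd.Series([1.0, 4.0, 9.0, 16.0])

via_diff = s.diff(1)                    # current - previous
current_minus_previous = s - s.shift()  # same thing written with shift
previous_minus_current = s.shift() - s  # opposite sign

# Series.equals treats NaNs in the same position as equal.
print(via_diff.equals(current_minus_previous))
print(via_diff.equals(-previous_minus_current))
```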
    

    Another update

    There is also a built-in method for Series and DataFrame division, div. This shaves off a little more time, bringing it below 1 ms:

    In [9]:
    
    %%timeit
    import numpy as np
    # slope = dy/dx as in the loop, using Series.div instead of the / operator
    df['angle'] = np.abs(180/np.pi * np.arctan(df['ypos'].diff(1).div(df['xpos'].diff(1))))
    
    1000 loops, best of 3: 951 µs per loop
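
For reference, `Series.div` gives the same result as the `/` operator. The sketch below, on assumed sample data, also shows what happens on a vertical segment where the x difference is zero: float division yields `inf` rather than raising, and `arctan(inf)` corresponds to 90 degrees.

```python
import numpy as np
import pandas as pd

# Hypothetical path whose last segment is vertical (xpos does not change).
df = pd.DataFrame({"xpos": [0.0, 1.0, 1.0], "ypos": [0.0, 2.0, 5.0]})

dx = df["xpos"].diff()
dy = df["ypos"].diff()

slope_operator = dy / dx   # operator form
slope_method = dy.div(dx)  # method form, same values

angle = np.abs(np.degrees(np.arctan(slope_method)))

print(slope_operator.equals(slope_method))
print(angle.tolist())
```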
    