I have a dataframe with consecutive pixel coordinates in rows and columns 'xpos', 'ypos', and I want to calculate the angle in degrees of each path between consecutive pixels.
You can do this with the vectorised method below. I compared the pandas way to your way and it is over 1000 times faster, and that is without adding the list back as a new column! This was done on a 10000-row dataframe.
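For anyone who wants to reproduce the timings, a frame of that size could be built roughly as follows. This is only a sketch: the random data, the seed and the coordinate range are assumptions, not the data used for the numbers in this answer.

```python
import numpy as np
import pandas as pd

# Hypothetical test data: 10000 random integer pixel coordinates.
# The seed and the 0..999 range are arbitrary choices for illustration.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "xpos": rng.integers(0, 1000, size=10_000),
    "ypos": rng.integers(0, 1000, size=10_000),
})
```

Absolute timings will differ by machine and library version, so only the relative gap between the loop and the vectorised code is meaningful.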
In [108]:
%%timeit
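import math
import numpy as np
# dy/dx against the previous row, matching the loop version; note the parentheses around each difference
df['angle'] = np.abs(180/math.pi * np.arctan((df['ypos'].shift() - df['ypos'])/(df['xpos'].shift() - df['xpos'])))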
1000 loops, best of 3: 1.27 ms per loop
In [100]:
%%timeit
temp_list=[]
for count, row in df.iterrows():
    x1 = row['xpos']
    y1 = row['ypos']
    try:
        x2 = df['xpos'].ix[count-1]
        y2 = df['ypos'].ix[count-1]
        a = abs(180/math.pi * math.atan((y2-y1)/(x2-x1)))
        temp_list.append(a)
    except KeyError:
        temp_list.append(np.nan)
1 loops, best of 3: 1.29 s per loop
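Before trusting a speed-up it is worth checking that both approaches give the same angles. The following is a minimal sketch on a tiny made-up frame; the sample coordinates and the names `loop_angles` and `vec_angles` are mine, and the loop is a simplified stand-in for the one timed above (it uses `iloc` and has no `try`/`except`).

```python
import math
import numpy as np
import pandas as pd

# Made-up sample path; chosen so that no step has dx == 0.
df = pd.DataFrame({"xpos": [0, 1, 3, 6], "ypos": [0, 1, 2, 2]})

# Loop version: compare each row with the previous one.
loop_angles = [np.nan]  # the first row has no previous point
for i in range(1, len(df)):
    dx = df["xpos"].iloc[i - 1] - df["xpos"].iloc[i]
    dy = df["ypos"].iloc[i - 1] - df["ypos"].iloc[i]
    loop_angles.append(abs(math.degrees(math.atan(dy / dx))))

# Vectorised version: shift() lines each row up with the previous one.
dx = df["xpos"].shift() - df["xpos"]
dy = df["ypos"].shift() - df["ypos"]
vec_angles = np.abs(np.degrees(np.arctan(dy / dx)))

same = np.allclose(loop_angles, vec_angles, equal_nan=True)
```

If `same` is False, the usual culprit is a missing pair of parentheses around one of the differences.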
Also, if possible, avoid using apply, as it operates row-wise. If you can find a vectorised method that works on the entire series or dataframe, always prefer that.
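To illustrate the difference, here is a small sketch that computes the same quantity both ways. The example (distance of each pixel from the origin) and the names `dist_apply` and `dist_vec` are made up for illustration and are not part of the question.

```python
import math
import numpy as np
import pandas as pd

df = pd.DataFrame({"xpos": [3, 6, 5], "ypos": [4, 8, 12]})

# Row-wise: the lambda is called once per row, in Python.
dist_apply = df.apply(lambda row: math.hypot(row["xpos"], row["ypos"]), axis=1)

# Vectorised: a single NumPy call over both whole columns.
dist_vec = np.hypot(df["xpos"], df["ypos"])
```

Both give the same values, but the vectorised call does its looping in compiled code, which is where the speed-up comes from on large frames.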
UPDATE
Seeing as you are just subtracting the previous row, there is a built-in method for this: diff.
This results in even faster code:
In [117]:
%%timeit
import math
import numpy as np
# dy/dx, matching the loop version
df['angle'] = np.abs(180/math.pi * np.arctan(df['ypos'].diff(1)/df['xpos'].diff(1)))
1000 loops, best of 3: 1.01 ms per loop
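One detail worth knowing: diff(1) computes current minus previous, whereas the shift() form above computes previous minus current. The sign flips in both the numerator and the denominator, so the ratio is unchanged. A small sketch with made-up values (the series `x` and `y` are hypothetical):

```python
import numpy as np
import pandas as pd

x = pd.Series([0, 1, 3, 6])
y = pd.Series([0, 1, 2, 4])

prev_minus_curr = x.shift() - x   # NaN, -1, -2, -3
curr_minus_prev = x.diff(1)       # NaN,  1,  2,  3

# The two ratios agree because both differences change sign together.
ratio_shift = (y.shift() - y) / (x.shift() - x)
ratio_diff = y.diff(1) / x.diff(1)
```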
Another update
There is also a built-in method for series and dataframe division, div. This shaves off more time and I achieve sub-1 ms time:
In [9]:
%%timeit
import math
import numpy as np
# dy/dx, matching the loop version
df['angle'] = np.abs(180/math.pi * np.arctan(df['ypos'].diff(1).div(df['xpos'].diff(1))))
1000 loops, best of 3: 951 µs per loop