Question
I need to find which cell of a grid of square cells each point falls in, given the point coordinates and the cell boundary coordinates in two pandas dataframes. I call dfc the dataframe containing the code and the boundary coordinates of each cell (this is a simplified version of the problem; in the real analysis I have a large grid of geographical cells and a very large number of points to check):
Code,minx,miny,maxx,maxy
01,0.0,0.0,2.0,2.0
02,2.0,2.0,3.0,3.0
and dfp the dataframe containing an Id and the coordinates of the points:
Id,x,y
0,1.5,1.5
1,1.1,1.1
2,2.2,2.2
3,1.3,1.3
4,3.4,1.4
5,2.0,1.5
Now I would like to perform a search that adds a new column (called 'GridCode') to the dfp dataframe, holding the code of the cell each point falls in. The cells are perfect squares, so I would like to do the analysis with something like:
a = np.where(
    (dfp['x'] > dfc['minx']) &
    (dfp['x'] < dfc['maxx']) &
    (dfp['y'] > dfc['miny']) &
    (dfp['y'] < dfc['maxy']),
    dfc['Code'],
    'na')
avoiding explicit loops over the dataframes. The lengths of the two dataframes are not the same. The resulting dataframe should look as follows:
   Id    x    y GridCode
0   0  1.5  1.5       01
1   1  1.1  1.1       01
2   2  2.2  2.2       02
3   3  1.3  1.3       01
4   4  3.4  1.4       na
5   5  2.0  1.5       na
Thanks in advance for your help!
Answer 1:
There is probably a better way, but since this question has been sitting unanswered for a while, here is one approach.
Use pandas boolean indexing to filter the dfc dataframe instead of np.where():
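def findGrid(row):
    # row is a single row of dfp (apply passes one row at a time with axis=1);
    # keep the codes of the dfc cells that strictly contain the point
    c = dfc[(row['x'] > dfc['minx']) &
            (row['x'] < dfc['maxx']) &
            (row['y'] > dfc['miny']) &
            (row['y'] < dfc['maxy'])].Code
    if len(c) == 0:
        # the point lies in no cell (or exactly on a cell boundary)
        return None
    else:
        # return the code of the first matching cell
        return c.iat[0]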
Then call it on each row of dfp with the pandas apply() function:
dfp['GridCode'] = dfp.apply(findGrid,axis=1)
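This yields the following (the codes appear as 1 and 2 rather than 01 and 02, presumably because the Code column was read as integers):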
   Id    x    y  GridCode
0   0  1.5  1.5         1
1   1  1.1  1.1         1
2   2  2.2  2.2         2
3   3  1.3  1.3         1
4   4  3.4  1.4       NaN
5   5  2.0  1.5       NaN
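Since the question asks to avoid row-by-row loops, here is a minimal sketch of a different technique: NumPy broadcasting instead of apply(). It is not the method used above, just one possible alternative under some assumptions. It assumes the Code column is stored as strings (so "01" keeps its leading zero), that a point on a cell boundary counts as outside, and that the first matching cell wins if cells overlap. The names inside, first_match and codes are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question; Code is kept as a string (assumption)
# so that "01" keeps its leading zero.
dfc = pd.DataFrame({
    "Code": ["01", "02"],
    "minx": [0.0, 2.0],
    "miny": [0.0, 2.0],
    "maxx": [2.0, 3.0],
    "maxy": [2.0, 3.0],
})
dfp = pd.DataFrame({
    "Id": [0, 1, 2, 3, 4, 5],
    "x": [1.5, 1.1, 2.2, 1.3, 3.4, 2.0],
    "y": [1.5, 1.1, 2.2, 1.3, 1.4, 1.5],
})

# Point coordinates as column vectors, shape (n_points, 1).
x = dfp["x"].to_numpy()[:, None]
y = dfp["y"].to_numpy()[:, None]

# Broadcasting against the cell bounds, shape (n_cells,), gives an
# (n_points, n_cells) boolean matrix: inside[i, j] is True when
# point i lies strictly inside cell j.
inside = (
    (x > dfc["minx"].to_numpy()) & (x < dfc["maxx"].to_numpy())
    & (y > dfc["miny"].to_numpy()) & (y < dfc["maxy"].to_numpy())
)

# argmax picks the first matching cell per point; for rows with no
# match it returns 0, so those rows are overwritten with "na" below.
first_match = inside.argmax(axis=1)
codes = dfc["Code"].to_numpy()[first_match]
dfp["GridCode"] = np.where(inside.any(axis=1), codes, "na")

print(dfp)
```

Note that the boolean matrix has one entry per (point, cell) pair, so for a very large grid and many points it may be necessary to process dfp in chunks to keep memory use reasonable.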
Source: https://stackoverflow.com/questions/27464394/find-points-in-cells-through-pandas-dataframes-of-coordinates