I\'m trying to selection objects within a region from a pandas dataframe which contains a list of item ids and lat lon pairs. Is there a selection method for this? I think t
A process to select points within a region as performed by the working code below starts with creating 2 geodataframes. The first one contains a polygon, and the second contains all the points to do spatial join
with the first. The spatial join operator within
is used to enable the points that fall inside the polygon to be selected. The result of the operation is also a geodataframe, it contains only the required points that fall within the area of the polygon.
The content of locations.csv
; 6 lines with column headers.
Note: no spaces in the first row.
ID,LAT,LON
1, 15.1, 10.0
2, 15.2, 15.1
3, 15.3, 20.2
4, 15.4, 25.3
5, 15.5, 30.4
The code:
import pandas as pd
import geopandas as gpd
from shapely import wkt
from shapely.geometry import Point, Polygon
from shapely.wkt import loads
# Create a geo-dataframe `polygon_df` having 1 row of polygon
# This polygon will be used to select points in a geodataframe
d = {'poly_id':[1], 'wkt':['POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))']}
df = pd.DataFrame( data=d )
geometry = [loads(pgon) for pgon in df.wkt]
polygon_df = gpd.GeoDataFrame(df, \
crs={'init': 'epsg:4326'}, \
geometry=geometry)
# One can plot this polygon with the command:
# polygon_df.plot()
# Read the file with `pandas`
locs = pd.read_csv('locations.csv', sep=',')
# Making it a geo-dataframe with new name: `geo_locs`
geo_locs = gpd.GeoDataFrame(locs, crs={'init': 'epsg:4326'})
locs_geom = [Point(xy) for xy in zip(geo_locs.LON, geo_locs.LAT)]
geo_locs['wkt'] = geo_locs.apply( lambda x: Point(x.LON, x.LAT), axis=1 )
geo_locs = gpd.GeoDataFrame(geo_locs, crs={'init': 'epsg:4326'}, \
geometry=geo_locs['wkt'])
# Do a spatial join of `point` within `polygon`, get the result in `pts_in_poly` GeodataFrame.
pts_in_poly = gpd.sjoin(geo_locs, polygon_df, op='within', how='inner')
# Print the ID of the points that fall within the polygon.
print(pts_in_poly.ID)
# The output will be:
#2 3
#3 4
#4 5
#Name: ID, dtype: int64
# Plot the polygon and all the points.
ax1 = polygon_df.plot(color='lightgray', zorder=1)
geo_locs.plot(ax=ax1, zorder=5, color="red")
The output plot:
In the plot, the points with ID's 3, 4, and 5 fall within the polygon.