I have a shapefile that has all the counties for the US, and I am doing a bunch of queries at a lat/lon point and then finding what county the point lies in. Right now I am
Your situation looks like a typical case where spatial joins are useful. The idea of a spatial join is to merge data using geographic coordinates instead of attributes.
Three possibilities in geopandas:
intersects
within
contains
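To make the difference between the three concrete, here is a minimal sketch using shapely (the geometry library geopandas builds on). The square and the three points are hypothetical toy geometries, not part of the county data.

```python
from shapely.geometry import Point, Polygon

# Hypothetical toy geometries: a unit square and three test points.
square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
inside = Point(0.5, 0.5)
on_edge = Point(1.0, 0.5)
outside = Point(2.0, 2.0)

# For each point: (point intersects square, point within square, square contains point)
results = {
    label: (pt.intersects(square), pt.within(square), square.contains(pt))
    for label, pt in {"inside": inside, "on_edge": on_edge, "outside": outside}.items()
}
for label, flags in results.items():
    print(label, flags)
```

A point on the boundary intersects the polygon but is not within it, and contains is simply within with the arguments swapped.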
It seems like you want within, which is possible using the following syntax:
geopandas.sjoin(points, polygons, how="inner", predicate="within")  # geopandas >= 0.10; older releases use op="within"
Note: You need to have rtree installed to be able to perform such operations. If you need to install this dependency, use pip or conda.
As an example, let's plot European cities. The two example datasets are loaded as follows:
import geopandas
import matplotlib.pyplot as plt
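# Note: geopandas.datasets was deprecated in geopandas 0.14 and removed in 1.0;
# on newer versions, download the Natural Earth files and pass their paths to read_file.
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
cities = geopandas.read_file(geopandas.datasets.get_path('naturalearth_cities'))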
countries = world[world['continent'] == "Europe"].rename(columns={'name':'country'})
countries.head(2)
      pop_est continent country iso_a3  gdp_md_est                                           geometry
18  142257519    Europe  Russia    RUS   3745000.0  MULTIPOLYGON (((178.725 71.099, 180.000 71.516...
21    5320045    Europe  Norway    -99    364700.0  MULTIPOLYGON (((15.143 79.674, 15.523 80.016, ...
cities.head(2)
           name                   geometry
0  Vatican City  POINT (12.45339 41.90328)
1    San Marino  POINT (12.44177 43.93610)
cities is a worldwide dataset and countries is a Europe-wide dataset.
Both datasets need to be in the same coordinate reference system. If they are not, use .to_crs on one of them before merging.
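A minimal sketch of that check, assuming two hypothetical toy layers (points and polygons) that arrive in different coordinate reference systems. The EPSG codes are only there to illustrate the mismatch.

```python
import geopandas
from shapely.geometry import Point, Polygon

# Hypothetical toy layers; only the CRS handling matters here.
points = geopandas.GeoDataFrame(
    {"name": ["a"]}, geometry=[Point(12.5, 41.9)], crs="EPSG:4326"
)
polygons = geopandas.GeoDataFrame(
    {"country": ["box"]},
    geometry=[Polygon([(10, 40), (15, 40), (15, 45), (10, 45)])],
    crs="EPSG:4326",
).to_crs("EPSG:3857")  # pretend this layer was delivered in Web Mercator

# Reproject the points only if the two layers disagree.
if points.crs != polygons.crs:
    points = points.to_crs(polygons.crs)

print(points.crs, polygons.crs)
```

Reprojecting the points onto the polygons' CRS is one option; going the other way works just as well, as long as both layers end up in the same system.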
data_merged = geopandas.sjoin(cities, countries, how="inner", predicate='within')  # use op='within' on geopandas < 0.10
Finally, to see the result, let's draw a map:
f, ax = plt.subplots(1, figsize=(20,10))
data_merged.plot(ax=ax)
countries.plot(ax=ax, alpha=0.25, linewidth=0.1)
plt.show()
The underlying dataset merges together the information we need:
data_merged.head(5)
             name                   geometry  index_right   pop_est continent  country iso_a3  gdp_md_est
0    Vatican City  POINT (12.45339 41.90328)          141  62137802    Europe    Italy    ITA   2221000.0
1      San Marino  POINT (12.44177 43.93610)          141  62137802    Europe    Italy    ITA   2221000.0
192          Rome  POINT (12.48131 41.89790)          141  62137802    Europe    Italy    ITA   2221000.0
2           Vaduz   POINT (9.51667 47.13372)          114   8754413    Europe  Austria    AUT    416600.0
184        Vienna  POINT (16.36469 48.20196)          114   8754413    Europe  Austria    AUT    416600.0
Here, I used the inner join method (how="inner"), but that's a parameter you can change if, for instance, you want to keep all points, including those not within a polygon (for example, how="left").
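Applied to lat/lon queries like the ones in the question, a rough sketch could look like the following. The coordinates, the single box polygon, and the region column are all hypothetical stand-ins for your query points and the county shapefile, and the predicate keyword assumes geopandas 0.10 or newer.

```python
import geopandas
from shapely.geometry import Polygon

# Hypothetical query coordinates; the last one falls outside the polygon.
lons = [12.45, 16.36, -30.0]
lats = [41.90, 48.20, 0.0]
points = geopandas.GeoDataFrame(
    {"name": ["p1", "p2", "p3"]},
    geometry=geopandas.points_from_xy(lons, lats),  # x = longitude, y = latitude
    crs="EPSG:4326",
)

# Hypothetical stand-in for the polygon layer (e.g. counties read with read_file).
polygons = geopandas.GeoDataFrame(
    {"region": ["box"]},
    geometry=[Polygon([(5, 35), (20, 35), (20, 50), (5, 50)])],
    crs="EPSG:4326",
)

# Inner join drops unmatched points; left join keeps them with missing attributes.
inner = geopandas.sjoin(points, polygons, how="inner", predicate="within")
left = geopandas.sjoin(points, polygons, how="left", predicate="within")

print(inner[["name", "region"]])
print(left[["name", "region"]])
```

With how="left", the point that matches no polygon is still in the result, with missing values in the polygon columns, which makes it easy to spot queries that fell outside every polygon. Joining all points in one call is usually preferable to querying them one by one.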