How to do a point in polygon query efficiently using geopandas?

Submitted by 谁都会走 on 2020-05-23 21:05:26

Question


I have a shapefile with all the counties in the US, and I am running a bunch of queries, each one taking a lat/lon point and finding which county the point lies in. Right now I am just looping through all the counties and calling pnt.within(county). This isn't very efficient. Is there a better way to do this?


Answer 1:


Your situation looks like a typical case where spatial joins are useful. The idea of spatial joins is to merge data using geographic coordinates instead of using attributes.

Three possibilities in geopandas:

  • intersects
  • within
  • contains

It seems like you want within, which is possible using the following syntax:

geopandas.sjoin(points, polygons, how="inner", predicate='within')  # geopandas < 0.10: op='within'
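For the county lookup in the question, the call might look like the minimal sketch below. The two square "counties", the column names `county` and `point_id`, and the coordinates are all made up for illustration; it also assumes geopandas 0.10 or later, where the keyword is `predicate` (older releases used `op`).

```python
import geopandas
from shapely.geometry import Point, box

# Two hypothetical square "counties" standing in for the real shapefile.
counties = geopandas.GeoDataFrame(
    {"county": ["A", "B"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4326",
)

# Query points as (lon, lat); the last one lies outside both squares.
points = geopandas.GeoDataFrame(
    {"point_id": [1, 2, 3]},
    geometry=[Point(0.5, 0.5), Point(1.5, 0.5), Point(5.0, 5.0)],
    crs="EPSG:4326",
)

# One vectorised join replaces the Python loop over counties.
# Assumes geopandas >= 0.10 (keyword `predicate`; older releases used `op`).
joined = geopandas.sjoin(points, counties, how="inner", predicate="within")
print(joined[["point_id", "county"]])
```

With an inner join, the point that falls outside every polygon is simply dropped from the result.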

Note: You need to have rtree installed to perform such operations. If you need to install this dependency, use pip or conda.
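The spatial index is what makes the join fast, and it can also be queried directly when points arrive one at a time. The sketch below is only an illustration: `county_of` is a hypothetical helper, the polygons are made up, and it assumes a geopandas version whose `sindex.query` accepts a `predicate` argument.

```python
import geopandas
from shapely.geometry import Point, box

# Hypothetical polygons standing in for the county shapefile.
counties = geopandas.GeoDataFrame(
    {"county": ["A", "B"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
)


def county_of(point, polygons):
    """Return the `county` value of the first polygon containing `point`, or None."""
    # The index narrows candidates by bounding box, then applies the predicate
    # (here: the query point is within the indexed polygon).
    hits = polygons.sindex.query(point, predicate="within")
    if len(hits) == 0:
        return None
    return polygons.iloc[hits[0]]["county"]


print(county_of(Point(1.5, 0.5), counties))
print(county_of(Point(5.0, 5.0), counties))
```

For a large batch of points, a single sjoin call is usually the simpler choice; the direct index query is mainly useful for one-off lookups.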

Example

As an example, let's plot European cities. The two example datasets are loaded as follows:

import geopandas
import matplotlib.pyplot as plt

world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
cities = geopandas.read_file(geopandas.datasets.get_path('naturalearth_cities'))
countries = world[world['continent'] == "Europe"].rename(columns={'name':'country'})

countries.head(2)
    pop_est    continent  country  iso_a3  gdp_md_est  geometry
18  142257519  Europe     Russia   RUS     3745000.0   MULTIPOLYGON (((178.725 71.099, 180.000 71.516...
21  5320045    Europe     Norway   -99     364700.0    MULTIPOLYGON (((15.143 79.674, 15.523 80.016, ...

cities.head(2)
   name          geometry
0  Vatican City  POINT (12.45339 41.90328)
1  San Marino    POINT (12.44177 43.93610)

cities is a worldwide dataset and countries is a Europe-wide dataset.

Both datasets need to be in the same coordinate reference system (CRS). If they are not, use .to_crs before joining.
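A minimal sketch of that check, using made-up data: the `region` and `point_id` columns and both geometries are hypothetical, and the points are deliberately stored in Web Mercator so that the reprojection has something to do.

```python
import geopandas
from shapely.geometry import Point, box

# Hypothetical polygon layer in WGS 84 (lon/lat degrees).
polygons = geopandas.GeoDataFrame(
    {"region": ["A"]}, geometry=[box(0, 0, 10, 10)], crs="EPSG:4326"
)

# Hypothetical point layer, deliberately converted to Web Mercator (metres).
points = geopandas.GeoDataFrame(
    {"point_id": [1]}, geometry=[Point(5.0, 5.0)], crs="EPSG:4326"
).to_crs("EPSG:3857")

# Reproject the points only when the two CRSs differ.
if points.crs != polygons.crs:
    points = points.to_crs(polygons.crs)

print(points.crs == polygons.crs)
```

Reprojecting the points rather than the polygons is just a choice made for this sketch; either layer can be converted as long as both end up in the same CRS.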

data_merged = geopandas.sjoin(cities, countries, how="inner", predicate='within')  # geopandas < 0.10: op='within'

Finally, to see the result, let's draw a map:

f, ax = plt.subplots(1, figsize=(20,10))
# The keyword for passing a matplotlib Axes to .plot() is ax, not axes
data_merged.plot(ax=ax)
countries.plot(ax=ax, alpha=0.25, linewidth=0.1)
plt.show()

The underlying dataset now combines the information we need from both sources:

data_merged.head(5)

     name          geometry                   index_right  pop_est   continent  country  iso_a3  gdp_md_est
0    Vatican City  POINT (12.45339 41.90328)  141          62137802  Europe     Italy    ITA     2221000.0
1    San Marino    POINT (12.44177 43.93610)  141          62137802  Europe     Italy    ITA     2221000.0
192  Rome          POINT (12.48131 41.89790)  141          62137802  Europe     Italy    ITA     2221000.0
2    Vaduz         POINT (9.51667 47.13372)   114          8754413   Europe     Austria  AUT     416600.0
184  Vienna        POINT (16.36469 48.20196)  114          8754413   Europe     Austria  AUT     416600.0

Here I used an inner join, but the how parameter can be changed: for instance, how="left" keeps all points, including those that are not within any polygon.
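As a rough illustration of a left join, the sketch below uses a made-up polygon and two made-up points (the `country` and `name` values are hypothetical) and assumes geopandas 0.10 or later for the `predicate` keyword.

```python
import geopandas
from shapely.geometry import Point, box

# One hypothetical polygon and two points, one of which falls outside it.
polygons = geopandas.GeoDataFrame(
    {"country": ["A"]}, geometry=[box(0, 0, 1, 1)], crs="EPSG:4326"
)
points = geopandas.GeoDataFrame(
    {"name": ["inside", "outside"]},
    geometry=[Point(0.5, 0.5), Point(5.0, 5.0)],
    crs="EPSG:4326",
)

# how="left" keeps every point; unmatched rows get NaN in the polygon columns.
# Assumes geopandas >= 0.10 (keyword `predicate`; older releases used `op`).
kept = geopandas.sjoin(points, polygons, how="left", predicate="within")
unmatched = kept[kept["country"].isna()]
print(unmatched["name"].tolist())
```

Filtering on the missing values afterwards is a quick way to list the points that did not land in any polygon.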



Source: https://stackoverflow.com/questions/61124544/how-to-do-a-point-in-polygon-query-efficiently-using-geopandas
