How do I select objects within a geographic region in a pandas DataFrame?

你的背包 2021-01-17 04:46

I'm trying to select objects within a region from a pandas dataframe which contains a list of item ids and lat lon pairs. Is there a selection method for this? I think t

1 answer
  • 2021-01-17 04:55

    The working code below selects the points within a region in three steps. It first creates two geodataframes: one holds a polygon, and the other holds all the points. It then runs a spatial join of the points against the polygon using the within operator, which keeps only the points that fall inside the polygon. The result is also a geodataframe, containing only the points that lie within the polygon.

    The content of locations.csv: 6 lines, including the column header row. Note: the header row contains no spaces.

    ID,LAT,LON
    1, 15.1, 10.0
    2, 15.2, 15.1
    3, 15.3, 20.2
    4, 15.4, 25.3
    5, 15.5, 30.4
    

    The code:

    import pandas as pd
    import geopandas as gpd
    from shapely.wkt import loads
    
    # Create a geo-dataframe `polygon_df` having 1 row of polygon
    # This polygon will be used to select points in a geodataframe
    d = {'poly_id': [1], 'wkt': ['POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))']}
    df = pd.DataFrame(data=d)
    geometry = [loads(pgon) for pgon in df.wkt]
    # The CRS is given as an authority string; the {'init': ...} dict form is deprecated
    polygon_df = gpd.GeoDataFrame(df, crs='EPSG:4326', geometry=geometry)
    
    # One can plot this polygon with the command:
    # polygon_df.plot()
    
    # Read the file with `pandas`
    locs = pd.read_csv('locations.csv', sep=',')
    
    # Make it a geo-dataframe named `geo_locs`; points are built as (x=LON, y=LAT)
    geo_locs = gpd.GeoDataFrame(locs, crs='EPSG:4326',
                                geometry=gpd.points_from_xy(locs.LON, locs.LAT))
    
    # Do a spatial join of `point` within `polygon`, get the result in `pts_in_poly` GeodataFrame.
    # `predicate=` replaces the older `op=` keyword (geopandas 0.10 and later).
    pts_in_poly = gpd.sjoin(geo_locs, polygon_df, predicate='within', how='inner')
    
    # Print the ID of the points that fall within the polygon.
    print(pts_in_poly.ID)
    
    # The output will be:
    #2    3
    #3    4
    #4    5
    #Name: ID, dtype: int64
    
    # Plot the polygon and all the points.
    ax1 = polygon_df.plot(color='lightgray', zorder=1)
    geo_locs.plot(ax=ax1, zorder=5, color="red")
    

    The output plot:

    In the plot, the points with IDs 3, 4, and 5 fall within the polygon.
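When there is only a single polygon, the same selection can also be written as a boolean mask instead of a spatial join. The following is a minimal sketch under that assumption, not part of the original answer: it reuses the polygon and the sample rows from above, reads the CSV text from an in-memory string (the name `csv_text` is illustrative) so that it runs without a file on disk, and filters the plain pandas dataframe with `GeoSeries.within`.

```python
import io

import geopandas as gpd
import pandas as pd
from shapely import wkt

# Same rows as locations.csv, held in memory so the sketch is self-contained.
csv_text = """ID,LAT,LON
1, 15.1, 10.0
2, 15.2, 15.1
3, 15.3, 20.2
4, 15.4, 25.3
5, 15.5, 30.4
"""
locs = pd.read_csv(io.StringIO(csv_text))

# The region of interest, parsed from the same WKT string used in the answer.
poly = wkt.loads('POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))')

# Build one point per row (x=LON, y=LAT), aligned to the dataframe index.
points = gpd.GeoSeries(gpd.points_from_xy(locs.LON, locs.LAT),
                       index=locs.index, crs='EPSG:4326')

# Boolean mask: True where the point lies inside the polygon.
mask = points.within(poly)

# Ordinary pandas boolean indexing keeps the matching rows.
selected = locs[mask]
print(selected.ID.tolist())  # [3, 4, 5]
```

With several polygons, or when the polygon attributes are needed in the result, the spatial join shown in the answer is the more natural fit; the mask keeps the result as a plain pandas dataframe.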
