Listing number of observations by location

不思量自难忘° 2021-01-26 17:46

Need help here. I am trying to create a new column that lists the number of restaurants within 200 meters of each restaurant, using latitude and longitude. I couldn't find any

2 answers
  • 2021-01-26 18:39

    One approach would be to compute the distance matrix, and then to figure out the ones that are sufficiently close (here I demonstrate being within 20 kilometers so the numbers aren't all 0):

    # Load the fields library
    library(fields)
    
    # Create a simple data frame to demonstrate (each row is a restaurant). The rdist.earth function
    # we're about to call takes as input something where the first column is longitude and the second
    # column is latitude.
    df = data.frame(longitude=c(-111.9269, -111.8983, -112.1863, -112.0739, -112.2766, -112.0692),
                    latitude=c(33.46337, 33.62146, 33.65387, 33.44990, 33.56626, 33.48585))
    
    # Let's compute the distance between each pair of restaurants, in kilometers (miles = FALSE).
    distances = rdist.earth(df, miles = FALSE)
    distances
    
    #          [,1]     [,2]     [,3]         [,4]     [,5]         [,6]
    # [1,]  0.00000 17.79813 32.07533 1.373515e+01 34.41932 1.344867e+01
    # [2,] 17.79813  0.00000 26.93558 2.510519e+01 35.61413 2.189270e+01
    # [3,] 32.07533 26.93558  0.00000 2.498676e+01 12.85352 2.162964e+01
    # [4,] 13.73515 25.10519 24.98676 1.344145e-04 22.84310 4.025824e+00
    # [5,] 34.41932 35.61413 12.85352 2.284310e+01  0.00000 2.122719e+01
    # [6,] 13.44867 21.89270 21.62964 4.025824e+00 21.22719 9.504539e-05
    
    # Compute the number of restaurants within 20 kilometers of the restaurant in each row.
    df$num.close = colSums(distances <= 20) - 1
    df$num.close
    # [1] 3 1 1 2 1 2
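
For the 200 meter threshold in the question, the same distance-matrix idea can be sketched in base R without the fields package. This is only a rough sketch: `haversine_m` is a helper written for this example (not part of any library), the Earth is assumed to be a sphere of radius 6,371,000 m, and the `restaurants` coordinates are made up.

```r
# Haversine great-circle distance in metres.
# Assumption: spherical Earth with mean radius 6,371,000 m.
haversine_m <- function(lon1, lat1, lon2, lat2, radius = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Hypothetical restaurants: the first three are under 100 m apart,
# the fourth is several kilometres away.
restaurants <- data.frame(
  longitude = c(-112.0740, -112.0730, -112.0735, -111.9269),
  latitude  = c(33.4484, 33.4484, 33.4490, 33.4634)
)

# Pairwise distance matrix; entry [i, j] is the distance from i to j.
idx <- seq_len(nrow(restaurants))
dist_m <- outer(idx, idx, function(i, j) {
  haversine_m(restaurants$longitude[i], restaurants$latitude[i],
              restaurants$longitude[j], restaurants$latitude[j])
})

# Count neighbours within 200 m, excluding each restaurant itself (the diagonal).
within_200 <- dist_m <= 200
diag(within_200) <- FALSE
restaurants$num_close <- rowSums(within_200)
restaurants$num_close
# [1] 2 2 2 0
```

Setting the diagonal to FALSE instead of subtracting 1 avoids relying on the self-distance comparing as within the threshold. Like any full distance matrix, this needs memory proportional to the square of the number of rows.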
    
  • 2021-01-26 18:40

    This is base R and untested code, but you should get the idea.

    I'm basically testing how many rows satisfy the circle equation x^2 + y^2 <= R^2 centred on each restaurant, excluding that restaurant itself, and storing that count in the new column. Note that the radius in my example is 200, but yours will be different: your x, y are latitude and longitude, so you will have to convert the radius of 200 metres into the same units by multiplying it by 2*pi radians / circumference of the earth, or by 360 degrees / circumference of the earth.

    df <- data.frame(
      latitude = runif(n=10,min=0,max=1000),
      longitude = runif(n=10,min=0,max=1000)
      )
    
    for (i in seq_len(nrow(df)))
    {
      # circle's centre
      xcentre <- df[i,'latitude']
      ycentre <- df[i,'longitude']
    
      # checking how many restaurants lie within 200 m of the above centre, noofcloserest column will contain this value
      df[i,'noofcloserest'] <- sum(
        (df[,'latitude'] - xcentre)^2 + 
          (df[,'longitude'] - ycentre)^2 
        <= 200^2
      ) - 1
    
      # logging part for deeper analysis
      cat(i,': ')
      # this prints the true/false vector for which row is within the radius, and which row isn't
      cat((df[,'latitude'] - xcentre)^2 + 
        (df[,'longitude'] - ycentre)^2 
      <= 200^2)
    
      cat('\n')
    
    }
    

    Output -

    1 : TRUE FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE FALSE
    2 : FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    3 : FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
    4 : TRUE FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE FALSE
    5 : FALSE FALSE TRUE FALSE TRUE FALSE FALSE TRUE FALSE FALSE
    6 : TRUE FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE FALSE
    7 : FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE TRUE
    8 : FALSE FALSE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE
    9 : FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE
    10 : FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE TRUE
    > df
        latitude longitude noofcloserest
    1  189.38878 270.25004             2
    2  402.36853 879.26657             0
    3  747.46417 581.66627             1
    4  291.64303 157.75450             2
    5  830.10699 736.19586             2
    6  299.06803 157.76147             2
    7  725.68360  58.53049             1
    8  893.31904 772.46217             1
    9   45.47875 701.82201             0
    10 645.44772 226.95042             1
    

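    What that output means is that for the coordinates at row 1, three rows are within a radius of 200 (standing in for 200 m): row 1 itself, and rows 4 and 6. Subtracting 1 for the restaurant itself gives the 2 shown in the noofcloserest column.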
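
The radius scaling this answer mentions for real coordinates might look roughly like the sketch below. It assumes a spherical Earth with a circumference of about 40,075,000 m and uses made-up coordinates. One degree of longitude shrinks with the cosine of the latitude, so the longitude radius is scaled separately; the mean latitude is used for that, which is only reasonable when all points are close together.

```r
# Assumption: spherical Earth, circumference of about 40,075,000 m.
earth_circumference_m <- 40075000
radius_m <- 200

# Hypothetical restaurants in real latitude/longitude degrees.
df <- data.frame(
  latitude  = c(33.4484, 33.4484, 33.4490, 33.4634),
  longitude = c(-112.0740, -112.0730, -112.0735, -111.9269)
)

# 200 m expressed in degrees of latitude (360 degrees / circumference),
# and in degrees of longitude at the mean latitude of the data.
radius_deg_lat <- radius_m * 360 / earth_circumference_m
radius_deg_lon <- radius_deg_lat / cos(mean(df$latitude) * pi / 180)

# Same circle test as in the loop, with each axis divided by its own radius
# so that the comparison is against 1.
df$noofcloserest <- vapply(seq_len(nrow(df)), function(i) {
  inside <- ((df$latitude - df$latitude[i]) / radius_deg_lat)^2 +
    ((df$longitude - df$longitude[i]) / radius_deg_lon)^2 <= 1
  sum(inside) - 1L  # exclude the restaurant itself
}, integer(1))
df$noofcloserest
# [1] 2 2 2 0
```

This flat approximation is fine for a 200 m radius over a city-sized area. For data spread over a wide range of latitudes, a great-circle distance is the safer choice.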
