Fastest way to determine COUNTRY from millions of GPS coordinates [R]

Submitted by 给你一囗甜甜゛ on 2019-12-06 06:41:07

Question


I have millions of GPS coordinates and want to quickly add a column of the country of the coordinates.

My current method works but is extremely slow:

library(data.table)

#REPRODUCE DATA
data <- data.table(latitude=sample(seq(47,52,by=0.001), 1000000, replace = TRUE),
                   longitude=sample(seq(8,23,by=0.001), 1000000, replace = TRUE))

#REQUIRED PACKAGES
if (!require("sp")) install.packages("sp")
if (!require("rworldmap")) install.packages("rworldmap")
if (!require("sf")) install.packages("sf")
library(sp)
library(rworldmap)
library(sf)

#CURRENT SLOW FUNCTION
coords2country = function(points,latcol,loncol){  
  countriesSP <- getMap(resolution='low')
  pointsSP <- st_as_sf(points,coords=c(loncol,latcol),crs=4326)
  pointsSP<- as(pointsSP,"Spatial")
  # use 'over' to get indices of the Polygons object containing each point 
  indices = over(pointsSP, countriesSP)
  # return the ADMIN names of each country
  indices$ADMIN  
  #indices$ISO3 # returns the ISO3 code 
  #indices$continent   # returns the continent (6 continent model)
  #indices$REGION   # returns the continent (7 continent model)
}

#SLOW!
system.time(data[,country:=coords2country(data,"latitude","longitude")])
#    user  system elapsed 
# 121.293   7.849 130.226 

Is there a faster/better way to do this? Thanks!


Answer 1:


There are two similar questions, linked in my comments above, that ask how to get country names from coordinates. Here the OP is asking for a faster way to do that task. Based on those posts, there are three options: the custom function in this question, the geonames package, and map.where() in the maps package. The geonames option needs a bit of setup, so I only tested map.where(). The result is below. On 1,000,000 points generated the same way, map.where() is much faster: about 7 seconds elapsed, versus the roughly 130 seconds reported in the question.

library(data.table)
library(maps)
set.seed(111)
data <- data.table(latitude=sample(seq(47,52,by=0.001), 1000000, replace = TRUE),
                   longitude=sample(seq(8,23,by=0.001), 1000000, replace = TRUE))

system.time(data[, country := map.where(x = longitude, y = latitude)])

#   user  system elapsed 
#   7.20    0.05    7.29 
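One thing to watch for when using the result as a country column: as far as I know, map.where() returns the polygon names of the "world" database, which can carry a sub-region after a colon, and it returns NA for points that fall outside every polygon. A minimal sketch of collapsing those labels to the country part, assuming that label format; strip_subregion is a hypothetical helper, not part of the maps package, and the sample labels are made up for illustration:

```r
# Hypothetical helper (not part of the maps package): keep only the text
# before the first ":" so that "Norway:Svalbard" becomes "Norway".
strip_subregion <- function(region) {
  sub(":.*$", "", region)
}

# Made-up labels in the style that map.where() returns; NA stands for a
# point that fell outside every polygon (for example, open sea).
labels <- c("Germany", "Norway:Svalbard", NA)

# NA values pass through sub() unchanged.
countries <- strip_subregion(labels)
print(countries)
```

With the data.table from the answer, this could be applied as data[, country := strip_subregion(map.where(x = longitude, y = latitude))].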


Source: https://stackoverflow.com/questions/52910965/fastest-way-to-determine-country-from-millions-of-gps-coordinates-r
