Question
I'm wondering if I have maximized the speed at which a mean of an area buffered around a point in a raster can be extracted.
Can performance be improved any further on these LOCALLY? I already run in parallel with mclapply, and I know I could get further gains by setting up and running this on a cluster ("use a cluster" or "get more CPUs" is not the answer I'm looking for).
Replicate some data:
library(raster)
library(parallel)
library(truncnorm)
library(gdalUtils)
library(velox)
library(sf)
ras <- raster(ncol=1000, nrow=1000, xmn=2001476, xmx=11519096, ymn=9087279, ymx=17080719)
ras[]=rtruncnorm(n=ncell(ras),a=0, b=10, mean=5, sd=2)
crs(ras) <- "+proj=utm +zone=51 +ellps=WGS84"
writeRaster(ras,"testras_so.tif", format="GTiff")
gdalbuildvrt(gdalfile = "testras_so.tif",
             output.vrt = "testvrt_so.vrt")
x1 <- runif(100,2001476,11519096)
y1 <- runif(100, 9087279,17080719)
poly <- st_buffer(st_sfc(st_point(c(x1[1],y1[1]), dim="XY"),crs=32651),200000)
vras <- velox("testvrt_so.vrt")
###########
Tests:
test1: already have the buffer polygon and the velox raster
test2: have to generate the buffer, but already have the velox raster
test3: have to generate the velox raster from the VRT (simulating having different rasters), but already have the buffer
test4: have to generate both (from the VRT)
test5: generate the velox raster from the TIF, already have the buffer
test6: generate both (TIF version)
#Test time if have poly and velox raster
test1 <- system.time(mclapply(seq_along(x1), function(x){
  vras$extract(poly, fun=function(t) mean(t,na.rm=T))
}))
#Test time if have to generate buffer but have velox raster
test2 <- system.time(mclapply(seq_along(x1), function(x){
  vras$extract(st_buffer(st_sfc(st_point(c(x1[x],y1[x]), dim="XY"),crs=32651),200000), fun=function(t) mean(t,na.rm=T))
}))
#Test time if have to generate velox from VRT (simulating having different rasters) but having the buffer
test3 <- system.time(mclapply(seq_along(x1), function(x){
  velox("testvrt_so.vrt")$extract(poly, fun=function(t) mean(t,na.rm=T))
}))
#Test time if have to generate velox from VRT AND generate buffer (simulating a list of rasters with different buffers each)
test4 <- system.time(mclapply(seq_along(x1), function(x){
  velox("testvrt_so.vrt")$extract(st_buffer(st_sfc(st_point(c(x1[x],y1[x]), dim="XY"),crs=32651),200000), fun=function(t) mean(t,na.rm=T))
}))
#Test time if have to generate velox from TIF (simulating having different rasters) but having the buffer
test5 <- system.time(mclapply(seq_along(x1), function(x){
  velox("testras_so.tif")$extract(poly, fun=function(t) mean(t,na.rm=T))
}))
#Test time if have to generate velox from TIF AND generate buffer (simulating a list of rasters with different buffers each)
test6 <- system.time(mclapply(seq_along(x1), function(x){
  velox("testras_so.tif")$extract(st_buffer(st_sfc(st_point(c(x1[x],y1[x]), dim="XY"),crs=32651),200000), fun=function(t) mean(t,na.rm=T))
}))
My results (yours will vary with the number of cores, since mclapply runs in parallel):
#Test time if have poly and velox raster
> test1
user system elapsed
0.007 0.022 3.417
#Test time if have to generate buffer but have velox raster
> test2
user system elapsed
0.007 0.023 3.540
#Test time if have to generate velox from VRT (simulating having different rasters) but having the buffer
> test3
user system elapsed
0.016 0.037 10.366
#Test time if have to generate velox from VRT AND generate buffer (simulating a list of rasters with different buffers each)
> test4
user system elapsed
0.017 0.035 10.309
#Test time if have to generate velox from TIF (simulating having different rasters) but having the buffer
> test5
user system elapsed
0.015 0.033 9.258
#Test time if have to generate velox from TIF AND generate buffer (simulating a list of rasters with different buffers each)
> test6
user system elapsed
0.016 0.034 9.347
Can anybody make any suggestions to make this faster, or have I maxed out local speed here? Thanks!
Answer 1:
I got a suggestion from @dbaston to pre-crop the raster in velox. This is the fastest way I have found to extract raster values in R so far:
If you already have the velox raster, this is unbelievably fast (lightning), even if you have to generate the buffer inside the function (not shown, but on my system it comes out around 0.4 elapsed; a sketch of that variant appears after the timing below):
test7_lightning <- system.time(mclapply(seq_along(x1), function(x){
  q <- vras$crop(poly)
  vras$extract(poly, fun=function(t) mean(t,na.rm=T))
}))
> test7_lightning
user system elapsed
0.001 0.005 0.355
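A minimal sketch of that "buffer inside the function" variant (my own illustration, not the original code; test7b and buf are hypothetical names, and it follows the same crop-then-extract pattern as test7, so no timing is claimed here beyond the ~0.4 elapsed reported above):
test7b <- system.time(mclapply(seq_along(x1), function(x){
  # build this point's buffer inside the worker
  buf <- st_buffer(st_sfc(st_point(c(x1[x],y1[x]), dim="XY"), crs=32651), 200000)
  # crop the velox raster to the buffer before extracting; mclapply forks the
  # process, so cropping here should not affect the parent's vras object
  vras$crop(buf)
  vras$extract(buf, fun=function(t) mean(t,na.rm=T))
}))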
It's still fast even if you have to dynamically load different rasters (simulated here by loading the same raster many times):
test8 <- system.time(mclapply(seq_along(x1), function(x){
  ras <- velox("testras_so.tif")
  ras$crop(poly)
  ras$extract(poly, fun=function(t) mean(t,na.rm=T))
}))
test9 <- system.time(mclapply(seq_along(x1), function(x){
  ras <- velox("testras_so.tif")
  ras$crop(st_buffer(st_sfc(st_point(c(x1[x],y1[x]), dim="XY"),crs=32651),200000))
  ras$extract(poly, fun=function(t) mean(t,na.rm=T))
}))
> test8
user system elapsed
0.011 0.016 4.450
> test9
user system elapsed
0.006 0.012 4.333
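Note that all of the system.time() wrappers above discard the extracted values. If you actually want the per-point means, collect the list that mclapply returns, e.g. (a sketch with the hypothetical names means_list and point_means, reusing the crop-then-extract pattern but extracting each point's own buffer):
means_list <- mclapply(seq_along(x1), function(x){
  buf <- st_buffer(st_sfc(st_point(c(x1[x],y1[x]), dim="XY"), crs=32651), 200000)
  ras <- velox("testras_so.tif")   # or reuse vras if the raster is already in memory
  ras$crop(buf)
  ras$extract(buf, fun=function(t) mean(t,na.rm=T))
})
point_means <- unlist(means_list)  # one mean per point in x1/y1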
Source: https://stackoverflow.com/questions/50870080/fastest-way-to-extract-a-raster-in-r-improve-the-time-of-my-reproducible-code