Question
I have two data frames:
points contains a series of points with x, y coordinates.
poly contains the coordinates of two polygons (I have over 100 in reality, but I am keeping it simple here).
I want to add an extra column called Area to the points data frame, containing the name of the polygon each point is in.
poly <- data.frame(
pol= c("P1", "P1","P1","P1","P1","P2","P2","P2","P2", "P2"),
x=c(4360, 7273, 7759, 4440, 4360, 8720,11959, 11440,8200, 8720),
y=c(1009, 9900,28559,28430,1009,9870,9740,28500,28040,9870))
points <- data.frame(
object = c("P1", "P1","P1","P2","P2","P2"),
timestamp= c(1485670023468,1485670023970, 1485670024565, 1485670025756,1485670045062, 1485670047366),
x=c(6000, 6000, 6050, 10000, 10300, 8000),
y=c(10000, 20000,2000,5000,20000,2000))
plot(poly$x, poly$y, type = 'l')
text(points$x, points$y, labels=points$object )
So in this example the first 2 rows should have Area = "P1", while the last point should be blank because it is not contained in any polygon.
I have tried using the function in.out but haven't been able to build the data frame as described.
Any help is much appreciated!
Answer 1:
Although this uses a for loop, it is quite fast in practice.
library(mgcv)
x <- split(poly$x, poly$pol)
y <- split(poly$y, poly$pol)
todo <- 1:nrow(points)
Area <- rep.int("", nrow(points))
pol <- names(x)
# loop through polygons
for (i in seq_along(x)) {
  # the vertices of i-th polygon
  bnd <- cbind(x[[i]], y[[i]])
  # points to allocate
  xy <- with(points, cbind(x[todo], y[todo]))
  inbnd <- in.out(bnd, xy)
  # allocation
  Area[todo[inbnd]] <- pol[i]
  # update 'todo'
  todo <- todo[!inbnd]
}
points$Area <- Area
Two reasons for its efficiency:
- The for loop runs over the polygons, not the points. So if you have 100 polygons and 100000 points to allocate, the loop has only 100 iterations, not 100000. Inside each iteration, the vectorization power of the C function in.out is exploited;
- It works progressively. Once a point has been allocated, it is excluded from later allocation. The todo variable tracks the points still to allocate through the loop. As the loop proceeds, the working set shrinks.
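The same polygon-wise loop with a shrinking working set can be sketched without any package. This is only an illustration: in_poly and assign_area are hypothetical names, and in_poly is a plain even-odd ray-casting test written in base R as a stand-in for in.out, so its treatment of points lying exactly on a boundary may differ from mgcv's.

```r
# Data from the question
poly <- data.frame(
  pol = c("P1", "P1", "P1", "P1", "P1", "P2", "P2", "P2", "P2", "P2"),
  x = c(4360, 7273, 7759, 4440, 4360, 8720, 11959, 11440, 8200, 8720),
  y = c(1009, 9900, 28559, 28430, 1009, 9870, 9740, 28500, 28040, 9870))
points <- data.frame(
  object = c("P1", "P1", "P1", "P2", "P2", "P2"),
  x = c(6000, 6000, 6050, 10000, 10300, 8000),
  y = c(10000, 20000, 2000, 5000, 20000, 2000))

# Hypothetical helper (stand-in for in.out): even-odd ray casting,
# vectorised over the points. vx, vy are the polygon vertices in order.
in_poly <- function(px, py, vx, vy) {
  n <- length(vx)
  inside <- logical(length(px))
  j <- n
  for (i in seq_len(n)) {
    # does edge j -> i straddle the horizontal line through each point?
    crosses <- (vy[i] > py) != (vy[j] > py)
    # x coordinate at which the edge meets that horizontal line
    x_at <- vx[i] + (py - vy[i]) * (vx[j] - vx[i]) / (vy[j] - vy[i])
    # horizontal edges never cross, so NaN/Inf in x_at is masked by 'crosses'
    hit <- crosses & (px < x_at)
    inside[hit] <- !inside[hit]
    j <- i
  }
  inside
}

# Hypothetical wrapper: loop over polygons, dropping allocated points.
assign_area <- function(points, poly) {
  vx <- split(poly$x, poly$pol)
  vy <- split(poly$y, poly$pol)
  area <- rep.int("", nrow(points))
  todo <- seq_len(nrow(points))
  for (nm in names(vx)) {
    if (length(todo) == 0L) break  # every point is already allocated
    hit <- in_poly(points$x[todo], points$y[todo], vx[[nm]], vy[[nm]])
    area[todo[hit]] <- nm
    todo <- todo[!hit]
  }
  area
}

points$Area <- assign_area(points, poly)
```

The early break is an addition to the loop above: once todo is empty there is nothing left to test, so the remaining polygons can be skipped.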
Answer 2:
You could use the function point.in.polygon from package sp:
library(sp)
# each row is tested against the polygon whose name matches its object value
points$Area <- apply(points, 1, function(p) {
  sel <- poly$pol == p[1]
  ifelse(point.in.polygon(p[3], p[4], poly$x[sel], poly$y[sel]), p[1], NA)
})
gives you
  object   timestamp     x     y Area
1     P1 1.48567e+12  6000 10000   P1
2     P1 1.48567e+12  6000 20000   P1
3     P1 1.48567e+12  6050  2000 <NA>
4     P2 1.48567e+12 10000  5000 <NA>
5     P2 1.48567e+12 10300 20000   P2
6     P2 1.48567e+12  8000  2000 <NA>
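Note that the call above checks each point only against the polygon named in its own object column. If the polygon is not known in advance, one possible variant is to test all points against every polygon in turn. The sketch below assumes the sp package is installed; assign_area_sp is a hypothetical name, and when polygons overlap it simply keeps the first match.

```r
# Hypothetical helper: label each point with the first polygon containing it.
assign_area_sp <- function(points, poly) {
  area <- rep(NA_character_, nrow(points))
  for (p in split(poly, poly$pol)) {
    # point.in.polygon() is vectorised over the points and returns 0
    # for points strictly outside the polygon
    hit <- sp::point.in.polygon(points$x, points$y, p$x, p$y) > 0
    # only fill points that have not been allocated yet
    area[hit & is.na(area)] <- as.character(p$pol[1])
  }
  area
}
```

It would be called as points$Area <- assign_area_sp(points, poly), leaving NA for points that fall in no polygon.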
Source: https://stackoverflow.com/questions/44020974/assign-polygon-to-data-point-in-r-dataframe