Question
I'm trying to avoid using a for loop in the following code by using sapply, if at all possible. The loop solution works perfectly fine for me; I'm just trying to learn more R and explore as many methods as possible.
Objective: I have a vector i and two vectors, sf (search for) and rp (replace). For each element of i, I need to loop over sf and replace each match with the corresponding element of rp.
i = c("1 6 5 4","7 4 3 1")
sf = c("1","2","3")
rp = c("one","two","three")
funn <- function(i) {
  # apply each (search, replace) pair in turn to the whole vector
  for (j in seq_along(sf)) i = gsub(sf[j], rp[j], i, fixed=TRUE)
  return(i)
}
print(funn(i))
Result (correct):
[1] "one 6 5 4" "7 4 three one"
I'd like to do the very same, but with sapply:
#Trying to avoid a for loop in a fun
#funn1 <- function(i) {
# i = gsub(sf,rp,i,fixed=T)
# return(i)
#}
#print(sapply(i,funn1))
Apparently, the commented-out code above will not work, as gsub only uses the first element of sf. This is my first time using sapply, so I'm not exactly sure how to convert an "inner" implicit loop into a vectorized solution. Any help (even a statement that this is not possible) is appreciated!
(I'm aware of mgsub, but that is not the solution here. I would like to keep gsub.)
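To illustrate why the commented-out attempt only picks up the first element of sf: the pattern and replacement arguments of gsub are not vectorized. A minimal sketch, assuming the small example vectors from above (in the R versions I'm aware of this signals a warning rather than an error, which is suppressed here):

```r
i  <- c("1 6 5 4", "7 4 3 1")
sf <- c("1", "2", "3")
rp <- c("one", "two", "three")

# gsub() takes a single pattern and a single replacement; when longer
# vectors are passed, only sf[1] and rp[1] are used.
res <- suppressWarnings(gsub(sf, rp, i, fixed = TRUE))
print(res)
# Only "1" -> "one" has been applied; the "3" is left untouched.
```

This is why some form of sequential application (a loop, Reduce, or similar) is needed to chain several replacements with gsub.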
EDIT: full code with packages, the solutions offered below, and timing:
#timing
library(microbenchmark)
library(functional)
i = rep(c("1 6 5 4","7 4 3 1"),10000)
sf = rep(c("1","2","3"),100)
rp = rep(c("one","two","three"),100)
#Loop
funn <- function(i) {
  for (j in seq_along(sf)) i = gsub(sf[j], rp[j], i, fixed=TRUE)
  return(i)
}
t1 = proc.time()
k = funn(i)
t2 = proc.time()
#print(k)
print(microbenchmark(funn(i),times=10))
#mapply
t3 = proc.time()
mapply(function(u,v) i<<-gsub(u,v,i), sf, rp)
t4 = proc.time()
#print(i)
print(microbenchmark(mapply(function(u,v) i<<-gsub(u,v,i), sf, rp),times=10))
#Curry
t5 = proc.time()
Reduce(Compose, Map(function(u,v) Curry(gsub, pattern=u, replacement=v), sf, rp))(i)
t6 = proc.time()
print(microbenchmark(Reduce(Compose, Map(function(u,v) Curry(gsub, pattern=u, replacement=v), sf, rp))(i), times=10))
#4th option
n <- length(sf)
sf <- setNames(sf,1:n)
rp <- setNames(rp,1:n)
t7 = proc.time()
Reduce(function(x,j) gsub(sf[j],rp[j],x,fixed=TRUE),c(list(i),as.list(1:n)))
t8 = proc.time()
print(microbenchmark(Reduce(function(x,j) gsub(sf[j],rp[j],x,fixed=TRUE),c(list(i),as.list(1:n))),times=10))
#Usual proc.time
print(t2-t1)
print(t4-t3)
print(t6-t5)
print(t8-t7)
Times:
Unit: milliseconds
expr min lq mean median uq max neval
funn(i) 143 143 149 145 147 165 10
Unit: seconds
expr min lq mean median uq max neval
mapply(function(u, v) i <<- gsub(u, v, i), sf, rp) 4.1 4.2 4.4 4.3 4.4 4.9 10
Unit: seconds
expr min lq mean median uq max neval
Reduce(Compose, Map(function(u, v) Curry(gsub, pattern = u, replacement = v), sf, rp))(i) 1.6 1.6 1.7 1.7 1.7 1.7 10
Unit: milliseconds
expr min lq mean median uq max neval
Reduce(function(x, j) gsub(sf[j], rp[j], x, fixed = TRUE), c(list(i), as.list(1:n))) 141 144 147 145 146 162 10
user system elapsed
0.15 0.00 0.15
user system elapsed
4.49 0.03 4.52
user system elapsed
1.68 0.02 1.68
user system elapsed
0.19 0.00 0.18
So, indeed, in this case the for loop offers the best timing and is (in my opinion) the most straightforward, simple, and possibly elegant option. Sticking with the loop.
Thanks to all. All suggestions accepted and upvoted.
Answer 1:
One approach: its advantage is conciseness, but it is clearly not functional-programming oriented, since it has the side effect of modifying i:
mapply(function(u,v) i<<-gsub(u,v,i), sf, rp)
#> i
#[1] "one 6 5 4" "7 4 three one"
Or here is a pure functional programming approach:
library(functional)
Reduce(Compose, Map(function(u,v) Curry(gsub, pattern=u, replacement=v), sf, rp))(i)
#[1] "one 6 5 4" "7 4 three one"
What it does: Map(function(u,v) Curry(gsub, pattern=u, replacement=v), sf, rp) builds a list of functions that respectively replace 1 with one, 2 with two, etc. These functions are then composed and applied to i, giving the desired result.
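The same "list of replacement functions, then chain them" idea can be sketched in base R without the functional package. This is only an illustration, not the code from the answer: the names replacers and result are made up here, and fixed = TRUE is added to match the loop in the question.

```r
i  <- c("1 6 5 4", "7 4 3 1")
sf <- c("1", "2", "3")
rp <- c("one", "two", "three")

# Build one closure per (pattern, replacement) pair.
# force() pins down u and v for each closure explicitly.
replacers <- Map(function(u, v) {
  force(u); force(v)
  function(x) gsub(u, v, x, fixed = TRUE)
}, sf, rp)

# Feed i through the closures from left to right.
result <- Reduce(function(x, f) f(x), replacers, i)
print(result)
```

Here Reduce plays the role of Compose: each closure receives the output of the previous one, starting from i.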
Answer 2:
sapply(seq_along(sf),function(x)i<-gsub(sf[x],rp[x],i))
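A note on what this call returns, as far as I can tell: the <- inside the anonymous function assigns to a local i, so each call starts again from the original i and the replacements do not accumulate. A small sketch with the example vectors from the question (the name res is only for illustration):

```r
i  <- c("1 6 5 4", "7 4 3 1")
sf <- c("1", "2", "3")
rp <- c("one", "two", "three")

# Each column of res holds i with just ONE of the replacements applied,
# because the assignment inside the function never touches the global i.
res <- sapply(seq_along(sf), function(x) i <- gsub(sf[x], rp[x], i))
print(dim(res))   # one column per element of sf
print(res[, 1])   # only "1" -> "one"
print(res[, 3])   # only "3" -> "three"
print(i)          # the global i is unchanged
```

To get the combined result, the intermediate value has to be carried from one step to the next, as the loop, the <<- variant, or Reduce do.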
Answer 3:
This is sequential, so a loop seems natural. Here's a solution that is almost as bad as <<-:
n <- length(sf)
Reduce(function(x,j) gsub(sf[j],rp[j],x,fixed=TRUE),c(list(i),as.list(1:n)))
# [1] "one 6 5 4" "7 4 three one"
Really, you should use a loop.
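For what it's worth, the Reduce call above can be written a little more directly by passing i as the init argument instead of prepending it to the list of indices. A minimal sketch with the example vectors from the question (the name res is just for illustration):

```r
i  <- c("1 6 5 4", "7 4 3 1")
sf <- c("1", "2", "3")
rp <- c("one", "two", "three")

# Reduce(f, x, init): start from i and fold over the indices of sf,
# applying one replacement per step.
res <- Reduce(function(x, j) gsub(sf[j], rp[j], x, fixed = TRUE),
              seq_along(sf), i)
print(res)
```

This is still the same sequential process as the for loop, just expressed as a fold.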
Source: https://stackoverflow.com/questions/30241461/trying-to-avoid-for-loop-with-sapply-for-gsub