I'm looking for a way to use the find and replace function in R to replace the entire value of a string, rather than just the matching part of the string. I have a dataset with
You can use gsub() as follows:
gsub(".*experiences.*", "exp", string, perl=TRUE)
# As @rawr notes, set perl=TRUE for improved efficiency
This regex matches strings that have any characters 0 or more times (i.e. .*) followed by "experiences", followed by any characters 0 or more times.
In this case, you are still replacing the entire match with "exp", but by using regex you expand the definition of the match (from "experiences" to ".*experiences.*") to achieve the desired substitution.
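To see the difference on a whole vector, here is a small sketch. The vector x and its values are made up for illustration and are not from the question. Elements that do not contain "experiences" are left unchanged in both cases.

```r
# Hypothetical example data (not from the original question)
x <- c("work experiences abroad", "education", "experiences")

# Without the surrounding .*, only the matching part is replaced
partial <- gsub("experiences", "exp", x)

# With .* on both sides, the whole element is replaced when it matches
whole <- gsub(".*experiences.*", "exp", x, perl = TRUE)

print(partial)
print(whole)
```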
You can also simply use gsub() and add .* before and after the pattern like this:
string<-"TransRights"
gsub(".*sR.*","HumanRights",string)
The outcome would be
HumanRights
There's no need to modify the string with gsub() since you know the desired value ("exp").
s = c(string,"bah","egad.experiences")
replace(s,grep("experiences",s),"exp")
# [1] "exp" "bah" "exp"
Speed. This is a little faster than the string modification in the other @Frank's answer.
(Thanks to @rawr for pointing out that we should both turn on perl parsing.)
ss <- c(replicate(1e6,s))
system.time(replace(ss,grep("experiences",ss,perl=TRUE),"exp"))
# user system elapsed
# 0.6 0.0 0.6
system.time(gsub(".*experiences.*", "exp", ss,perl=TRUE))
# user system elapsed
# 2.39 0.00 2.38
Taking away the replacement operations in each answer, it looks like the different patterns being matched make up most of the difference (contrary to what I had expected, seen in my last edit):
system.time(grep("experiences",ss,perl=TRUE)) # used in my answer
# user system elapsed
# 0.64 0.00 0.64
system.time(grep(".*experiences.*",ss,perl=TRUE)) # used in purple-gravatar @Frank's answer
# user system elapsed
# 1.82 0.00 1.82
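Whichever turns out faster on your machine, the two approaches should return the same vector for inputs like these. A quick check, using a small made-up vector (the names via_replace and via_gsub are just for illustration):

```r
# Hypothetical test vector (values chosen for illustration)
s <- c("some experiences here", "bah", "egad.experiences")

# Approach 1: find matching indices, then overwrite those elements
via_replace <- replace(s, grep("experiences", s, perl = TRUE), "exp")

# Approach 2: widen the regex so the match covers the whole element
via_gsub <- gsub(".*experiences.*", "exp", s, perl = TRUE)

same <- identical(via_replace, via_gsub)
print(same)
```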
gsub() is used to substitute a particular string with another string. In the above code, if you do the following, your whole string changes to "exp":
result <- gsub(string, "exp", string)
But if you use grep() and replace(), you will achieve your desired result.
res1 <- grep("pattern",string)
This gives you the indices of all the elements that contain the pattern, which you can then use in replace().
res_new <- replace(string,res1,"exp")
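Putting the two steps together, a minimal runnable sketch might look like this. The input vector is hypothetical, and "experiences" stands in for whatever pattern you are searching for.

```r
# Hypothetical input; "experiences" stands in for the pattern of interest
string <- c("good experiences", "none", "bad experiences")

# grep() returns the positions of the elements that match
res1 <- grep("experiences", string)

# replace() overwrites the elements at those positions with the new value
res_new <- replace(string, res1, "exp")
print(res_new)
```

If nothing matches, grep() returns integer(0) and replace() hands back the vector unchanged.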