Question
The objective is to parse a regular expression and replace the matched pattern.
Consider this example:
data <- c("cat 6kg","cat g250", "cat dog","cat 10 kg")
I have to locate all occurrences of cat and a number [0-9]. To do this:
found <- data[grepl("(^cat.[a-z][0-9])|(^cat.[0-9])",data)]
found
[1] "cat 6kg" "cat g250" "cat 10 kg"
The next step is to replace each element of found with the string cat. I have tried gsub, sub, and gsubfn() from the gsubfn package, following Stack Overflow question 20219311:
gsubfn("((^cat.[a-z][0-9])|(^cat.[0-9]))", "cat",data)
[1] "catkg" "cat50" "cat dog" "cat0 kg"
which is NOT the expected result:
[#] "cat" "cat" "cat dog" "cat"
I think I'm missing a point. I would appreciate any help I could get. Thanks.
Answer 1:
Simply assign the string cat to the matching elements. This replaces the entire content of each matching element with cat:
> data <- c("cat 6kg","cat g250", "cat dog","cat 10 kg")
> data[grepl("(^cat.[a-z][0-9])|(^cat.[0-9])",data)] <- "cat"
> data
[1] "cat" "cat" "cat dog" "cat"
or
> data <- c("cat 6kg","cat g250", "cat dog","cat 10 kg")
> data[grepl("^cat.[a-z]?[0-9]",data)] <- "cat"
> data
[1] "cat" "cat" "cat dog" "cat"
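For context on why the sub-style attempt in the question returned "catkg": sub and gsub replace only the substring that the pattern matched, not the whole element. The following is a minimal sketch in base R, reusing the second pattern above. The trailing .* is an addition for illustration (it is not part of the question's pattern) so that the match extends to the end of the string.

```r
data <- c("cat 6kg", "cat g250", "cat dog", "cat 10 kg")

# Without a trailing .*, only the matched prefix ("cat 6", "cat g2", "cat 1")
# is replaced, so the rest of each string survives.
partial <- sub("^cat.[a-z]?[0-9]", "cat", data)
partial
#[1] "catkg" "cat50" "cat dog" "cat0 kg"

# With .* appended, the match covers the whole element, so the whole
# element is replaced. "cat dog" does not match and is left alone.
result <- sub("^cat.[a-z]?[0-9].*", "cat", data)
result
#[1] "cat" "cat" "cat dog" "cat"
```

Unlike the indexed assignment above, this leaves data itself unmodified and returns a new vector.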
Answer 2:
You could also do
sub('\\s*dog(*SKIP)(*F)|(?<=cat).*', '', data, perl=TRUE)
#[1] "cat" "cat" "cat dog" "cat"
Or
sub('(cat)\\s*([0-9]|[a-z][0-9]).*$', '\\1', data)
#[1] "cat" "cat" "cat dog" "cat"
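The first pattern relies on the PCRE backtracking verbs (*SKIP)(*F) to discard any match that starts at " dog". If those verbs are unfamiliar, roughly the same effect on this sample data can be sketched with a negative lookahead. This is an alternative formulation for illustration, not the answerer's pattern, and like the original it excludes "dog" rather than testing for a digit.

```r
data <- c("cat 6kg", "cat g250", "cat dog", "cat 10 kg")

# (?<=cat)     : start the match right after "cat"
# (?!\\s*dog)  : refuse to match if optional whitespace and "dog" follow
# .*           : otherwise consume the rest of the string and delete it
result <- sub("(?<=cat)(?!\\s*dog).*", "", data, perl = TRUE)
result
#[1] "cat" "cat" "cat dog" "cat"
```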
Answer 3:
Try this (note that the output keeps the space after cat):
gsub('(\\w?[0-9].*)','',data)
#[1] "cat " "cat " "cat dog" "cat "
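If the exact result from the question ("cat" with no trailing space) is needed, one possible adjustment is to let the pattern also consume any whitespace before the number. This is a sketch that modifies the pattern above by prefixing \\s*, which is an assumption about the intent and not part of the original answer. Note that neither version is anchored to cat, so both would strip from the first digit in any element.

```r
data <- c("cat 6kg", "cat g250", "cat dog", "cat 10 kg")

# \\s* swallows the whitespace that precedes the optional word character
# and the digit, so nothing is left behind after "cat".
result <- gsub("\\s*\\w?[0-9].*", "", data)
result
#[1] "cat" "cat" "cat dog" "cat"
```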
Source: https://stackoverflow.com/questions/31625706/regular-expression-parsed-with-grepl-replacement