Question
I have a character vector:
str = c(".wow", "if.", "not.confident", "wonder", "have.difficulty", "shower")
I am trying to replace each "." that sits between two words with a space, so that the result looks like this:
".wow", "if.", "not confident", "wonder", "have difficulty", "shower"
First, I tried
gsub("[\\w.\\w]", " ", str)
[1] " o " "if " "not confident" " onder"
[5] "have difficulty" "sho er"
That gave me the space I wanted, but it also removed every "w". Then I tried:
gsub("\\w\\.\\w", " ", str)
[1] ".wow" "if." "no onfident" "wonder"
[5] "hav ifficulty" "shower"
That kept the w's, but it also removed the characters immediately before and after the ".".
I cannot use this either:
gsub("\\.", " ", str)
[1] " wow" "if " "not confident" "wonder"
[5] "have difficulty" "shower"
because it also replaces every "." that is not between words.
Answer 1:
Try:
gsub('(\\w)\\.(\\w)', '\\1 \\2', str)
#[1] ".wow" "if." "not confident" "wonder"
#[5] "have difficulty" "shower"
Or
gsub('(?<=[^.])[.](?=[^.])', ' ', str, perl=TRUE)
Or as @rawr suggested
gsub('\\b\\.\\b', ' ', str, perl = TRUE)
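As a quick check, the three patterns above can be compared on the sample vector. This is only a sketch: the names `x`, `by_groups`, `by_negated`, `by_boundary` and the extra input `"a .b"` are illustrative and not part of the original answer. It also shows one case where the patterns are not interchangeable, because `[^.]` accepts any non-dot neighbour (including a space) while `\\w` accepts only word characters.

```r
# Sample data from the question (named `x` here to avoid masking utils::str)
x <- c(".wow", "if.", "not.confident", "wonder", "have.difficulty", "shower")

by_groups   <- gsub("(\\w)\\.(\\w)", "\\1 \\2", x)
by_negated  <- gsub("(?<=[^.])[.](?=[^.])", " ", x, perl = TRUE)
by_boundary <- gsub("\\b\\.\\b", " ", x, perl = TRUE)

print(by_groups)
# [1] ".wow"            "if."             "not confident"   "wonder"
# [5] "have difficulty" "shower"
print(identical(by_groups, by_negated))   # TRUE
print(identical(by_groups, by_boundary))  # TRUE

# Hypothetical extra input: a dot preceded by a space
spaced <- "a .b"
negated_spaced <- gsub("(?<=[^.])[.](?=[^.])", " ", spaced, perl = TRUE)
groups_spaced  <- gsub("(\\w)\\.(\\w)", "\\1 \\2", spaced)
print(negated_spaced)  # "a  b" (dot replaced, the space counts as "not a dot")
print(groups_spaced)   # "a .b" (unchanged, the space is not a word character)
```

Which behaviour is preferable depends on whether a dot next to whitespace or punctuation should count as being "between words".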
Answer 2:
Using capturing groups and back-references (note that sub replaces only the first match in each element; use gsub to replace every match):
sub('(\\w)\\.(\\w)', '\\1 \\2', str)
# [1] ".wow" "if." "not confident" "wonder"
# [5] "have difficulty" "shower"
A capturing group can be created by placing the characters to be grouped inside a set of parentheses ( ... ). Backreferences recall what was matched by a capturing group.
A backreference is specified as a backslash (\) followed by a digit indicating the number of the group.
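To make the role of the backreferences concrete, here is a small sketch on a single string. The variable names (`x`, `kept`, `swapped`, `dropped`) and the swapped replacement are illustrative additions, not part of the original answer.

```r
x <- "have.difficulty"

# \\1 and \\2 put back the characters captured on each side of the dot
kept <- sub("(\\w)\\.(\\w)", "\\1 \\2", x)
print(kept)     # "have difficulty"

# Groups are numbered left to right, so they can also be reordered
swapped <- sub("(\\w+)\\.(\\w+)", "\\2 \\1", x)
print(swapped)  # "difficulty have"

# Without backreferences the matched neighbours are lost,
# which is what happened in the question's second attempt
dropped <- sub("(\\w)\\.(\\w)", " ", x)
print(dropped)  # "hav ifficulty"
```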
Using lookaround assertions:
Lookarounds are zero-width assertions. They don't "consume" any characters in the string, so the characters before and after the dot are checked but not replaced.
sub('(?<=\\w)\\.(?=\\w)', ' ', str, perl = TRUE)
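The zero-width property matters when dots are separated by a single character. The sketch below uses a hypothetical input, `"a.b.c"`, which is not in the original question. With capturing groups, the "b" is consumed by the first match and is no longer available as the left neighbour of the second dot; with lookarounds it is only inspected, so both dots are replaced. It also shows that sub stops after the first match.

```r
x <- "a.b.c"  # hypothetical input with two dots sharing the neighbour "b"

# Capturing groups consume "a.b", so ".c" no longer has a word character before it
with_groups <- gsub("(\\w)\\.(\\w)", "\\1 \\2", x)
print(with_groups)   # "a b.c"

# Lookarounds only inspect the neighbours, so every qualifying dot is replaced
with_lookaround <- gsub("(?<=\\w)\\.(?=\\w)", " ", x, perl = TRUE)
print(with_lookaround)  # "a b c"

# sub() replaces just the first match, even with lookarounds
first_only <- sub("(?<=\\w)\\.(?=\\w)", " ", x, perl = TRUE)
print(first_only)    # "a b.c"
```

For the sample vector in the question the two approaches give the same result, since no element contains two dots that share a neighbouring character.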
Source: https://stackoverflow.com/questions/29476002/how-to-substitute-a-special-character-between-words-in-r