I have a database, and throughout the text there are some quotes enclosed in quotation marks. I would like to remove all the dots "." that are enclosed in quotation marks.
mystring <- '"é preciso olhar para o futuro. vou atuar" no front em que posso
fazer alguma coisa "para .frente", disse jose.'
You can use the following pattern with gsub:
gsub('(?!(([^"]*"){2})*[^"]*$)\\.', "", mystring, perl = TRUE)
The same with stringr:
str_replace_all(mystring, '(?!(([^"]*"){2})*[^"]*$)\\.', '')
Output:
#> "é preciso olhar para o futuro vou atuar" no front em que posso
#> fazer alguma coisa "para frente", disse jose.
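The lookahead in this pattern is terse, so here is one way to read it: (([^"]*"){2})*[^"]*$ succeeds when an even number of quotation marks remains up to the end of the string, and the negative lookahead therefore keeps only dots followed by an odd number of them, that is, dots inside a quoted span. A minimal sketch, assuming the quotes are balanced; the sample string s is made up for illustration:

```r
# Hypothetical sample input; quotation marks are assumed to be balanced.
s <- 'a. "b.c" d. "e." f.'

# Negative lookahead: reject a dot if an even number of quotes follows it
# up to the end of the string. Only dots followed by an odd number of
# quotes (i.e. dots inside a "..." pair) are matched and removed.
pattern <- '(?!(([^"]*"){2})*[^"]*$)\\.'

res <- gsub(pattern, "", s, perl = TRUE)
print(res)
```

Dots outside the quoted spans are left alone because an even number of quotes (possibly zero) follows each of them. If the input contains an unmatched quote, the parity count shifts and this reading no longer holds.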
You may simply use str_replace_all with a mere "[^"]*" pattern and use a callback function as the replacement argument to remove all dots with a gsub call:
str_replace_all(string, '"[^"]*"', function(x) gsub(".", "", x, fixed=TRUE))
So, "[^"]*" matches all substrings in string that start with ", then have zero or more characters other than ", and then end with a closing ". Each match is passed to the callback as x, where gsub(".", "", x, fixed=TRUE) replaces every . with an empty string (fixed=TRUE makes it a literal dot, not a regex pattern).
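If stringr is not available, the same idea of matching each quoted span and cleaning it separately can be approximated in base R with gregexpr and regmatches. This is only a sketch under the assumption of balanced quotes; the helper name remove_dots_in_quotes and the sample string are hypothetical, not part of the answers above:

```r
# Hypothetical helper: strip literal dots from every "..." span in x.
remove_dots_in_quotes <- function(x) {
  m <- gregexpr('"[^"]*"', x)   # locate every quoted span
  quoted <- regmatches(x, m)    # list with one character vector per element of x
  # Remove literal dots inside each matched span, then write the spans back.
  regmatches(x, m) <- lapply(quoted, function(q) gsub(".", "", q, fixed = TRUE))
  x
}

# Made-up sample input for illustration.
s <- '"one. two" three. "four .five", six.'
res <- remove_dots_in_quotes(s)
print(res)
```

The replacement form of regmatches plays the role of the callback here: text outside the matched spans is never touched, so no lookahead is needed.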