Which regex removes punctuation inside quotation marks in text?

孤独总比滥情好 2021-01-13 19:46

I have a database, and throughout its text there are passages enclosed in quotation marks. I would like to remove all the dots (".") that are enclosed in quotation marks i

2 Answers
  • 2021-01-13 20:38
    # Sample text: dots occur both inside and outside the quoted spans
    mystring <- '"é preciso olhar para o futuro. vou atuar" no front em que posso 
    fazer alguma coisa "para .frente", disse jose.'
    

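    You can use the following pattern with gsub. The negative lookahead rejects any dot that is followed by an even number of quotation marks up to the end of the string, so only dots followed by an odd number of quotes (that is, dots inside a quoted span) are removed: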

    # perl = TRUE is required: the default regex engine does not support lookaheads
    gsub('(?!(([^"]*"){2})*[^"]*$)\\.', "", mystring, perl = TRUE)
    

    Same with stringr:

    library(stringr)
    str_replace_all(mystring, '(?!(([^"]*"){2})*[^"]*$)\\.', '')
    

    Output:

    #> "é preciso olhar para o futuro vou atuar" no front em que posso 
    #> fazer alguma coisa "para frente", disse jose.
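
    To see why the lookahead works, here is a small base-R sketch that counts the quotation marks after each dot and compares that count with what the gsub call above removes. The input x and the names dot_pos, quotes_after, inside and cleaned are made up for illustration, and the sketch assumes straight double quotes that are balanced:

    ```r
    # Illustrative input (hypothetical, not the question's data)
    x <- 'a "b. c" d. "e .f" g.'

    # Positions of every literal dot in x
    dot_pos <- gregexpr(".", x, fixed = TRUE)[[1]]

    # Number of quotation marks that follow each dot
    quotes_after <- sapply(dot_pos, function(p) {
      rest <- substring(x, p + 1)
      lengths(regmatches(rest, gregexpr('"', rest, fixed = TRUE)))
    })

    # An odd count means the dot sits inside a quoted span
    inside <- quotes_after %% 2 == 1

    # Same lookahead pattern as in the answer, applied to x
    cleaned <- gsub('(?!(([^"]*"){2})*[^"]*$)\\.', "", x, perl = TRUE)

    print(data.frame(dot_pos = as.integer(dot_pos), quotes_after, inside))
    print(cleaned)
    ```

    For this input, the dots flagged as inside are the ones the pattern removes. If the quotes are unbalanced, the parity count no longer reflects what is quoted, so the pattern may remove the wrong dots.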
    
  • 2021-01-13 20:45

    You can simply use str_replace_all with the plain pattern "[^"]*" and pass a callback function as the replacement argument; the callback removes every dot from each match with a gsub call:

    str_replace_all(string, '"[^"]*"', function(x) gsub(".", "", x, fixed=TRUE))
    

    So,

    • "[^"]*" matches every substring of string that starts with ", continues with zero or more characters other than ", and ends with ".
    • Each match is passed to the callback as x, where gsub(".", "", x, fixed=TRUE) replaces every . with an empty string (fixed=TRUE treats the dot as a literal character rather than a regex metacharacter).
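
    If stringr is not available, the same match-then-clean idea can be sketched in base R with gregexpr and regmatches. This is a swapped-in alternative, not the approach shown above; the input x and the names m, stripped and result are hypothetical:

    ```r
    # Illustrative input (hypothetical, not the question's data)
    x <- 'he said "wait. stop" and left. then "go .now", ok.'

    # Locate every "..." span
    m <- gregexpr('"[^"]*"', x)

    # Remove literal dots from the matched spans only
    stripped <- lapply(regmatches(x, m), function(q) gsub(".", "", q, fixed = TRUE))

    # Write the cleaned spans back into a copy of the input
    result <- x
    regmatches(result, m) <- stripped

    print(result)
    ```

    Dots outside the quoted spans are never touched, because gsub only ever sees the text that the "[^"]*" pattern matched.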