How to replace \r\n characters in a text string specifically in R

Posted by 房东的猫 on 2021-02-11 14:38:33

Question


For the life of me, I am unable to strip out some escape characters from a text string (prior to further processing). I've tried stringi and gsub, but I just cannot get the syntax right.

Here is my text string

txt <- "c(\"\\r\\n    Stuff from a webpage: That I scraped using webcrawler\\r\\n\", \"\\r\\n        \", \"\\r\\n        \", \"\\r\\n        \", \"\\r\\n\\r\\n        \", \"\\r\\n\\r\\n        \", \"\\r\\n        \\r\\n    \", \"\\r\\n    \")"

I'd like to strip out "\\r\\n" from this string.

I've tried

gsub("[\\\r\\\n]", "", txt)  # leaves me with "rn"
gsub("[\\r\\n]", "", txt)    # leaves me without ANY r or n in the text
gsub("[\r\n]", "", txt)      # strips nothing

How can I remove these characters? Bear in mind that this will need to work over other entries that may have normal words ending in "rn" or have "rn" in the middle somewhere!

Thanks!


Answer 1:


Not very pretty, but this works:

library(stringr)
str_remove_all(txt, "(?<=\\\\n)\\s+|\\s+(?=\\\")|\\\"|(?<=\\\"),|\\\\r(?=\\\\n)|(?<=\\\\r)\\\\n")
[1] "c(Stuff from a webpage: That I scraped using webcrawler)"

I'm sure there are more efficient regex solutions, but I just fed it every possibility of things you don't want.

I also got rid of all the extra quotes (printed as \"), the "," separators, and the white space.

If you just want to match the result that you posted above:

str_remove_all(txt, "\\\\r(?=\\\\n)|(?<=\\\\r)\\\\n")

This reads: remove any instance of \\r followed by \\n, or any \\n preceded by \\r.
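For what it's worth, the two lookaround alternatives always pair up (every \\r followed by \\n goes with a \\n preceded by \\r), so matching the literal four-character sequence directly should give the same result. A minimal sketch in base R, assuming a shortened version of the question's string; the names no_lookaround and with_lookaround are just for illustration:

```r
# Shortened version of the string from the question. It holds a literal
# backslash followed by "r", then a literal backslash followed by "n".
txt <- "c(\"\\r\\n    Stuff from a webpage\\r\\n\", \"\\r\\n    \")"

# In the regex, each doubled backslash matches one literal backslash,
# so this pattern matches the four characters \ r \ n in a row.
no_lookaround <- gsub("\\\\r\\\\n", "", txt)

# The lookaround pattern from this answer, run through base R's PCRE engine.
with_lookaround <- gsub("\\\\r(?=\\\\n)|(?<=\\\\r)\\\\n", "", txt, perl = TRUE)

# Compare the two results.
print(identical(no_lookaround, with_lookaround))
print(no_lookaround)
```

Because the pattern requires the backslashes, ordinary words that merely contain "rn" should not be affected.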




Answer 2:


At the risk of answering my own question too quickly, I've found a bodge workaround which simply involves switching out the "\" for a rare placeholder, "__", then removing the resulting "__r__n" sequence:

gsub('__r__n', '', gsub('[\\\\]', '__', txt))

... but it would be valuable I think to share a better "one hit" solution.
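
One possible "one hit" version: since the target is a fixed character sequence rather than a pattern, gsub() can be told to skip regex interpretation with fixed = TRUE. A minimal sketch, assuming made-up test strings (the vector x below is hypothetical, not from the question) to check that ordinary words containing "rn" are left alone:

```r
# Hypothetical inputs: two hold the literal characters \r\n (backslashes
# included), one is made of ordinary words that merely contain "rn".
x <- c("modern\\r\\n", "northern barn", "\\r\\n    turn\\r\\n")

# With fixed = TRUE the pattern is taken literally, so "\\r\\n" in R source
# means exactly: backslash, r, backslash, n.
cleaned <- gsub("\\r\\n", "", x, fixed = TRUE)

print(cleaned)
```

If the leftover padding is unwanted too, base R's trimws() could be applied to the result afterwards.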



Source: https://stackoverflow.com/questions/51384784/how-to-replace-r-n-characters-in-a-text-string-specifically-in-r
