R regex gsub separate letters and numbers

Submitted by 丶灬走出姿态 on 2019-12-04 21:16:21

Question


I have a string that mixes letters and numbers:

"The sample is 22mg"

I'd like to insert a space wherever a number is immediately followed by a letter, like this:

"The sample is 22 mg"

I've tried this:

gsub('[0-9]+[[aA-zZ]]', '[0-9]+ [[aA-zZ]]', 'This is a test 22mg')

but I'm not getting the desired result.

Any suggestions?


Answer 1:


You need to use capturing parentheses in the regular expression and group references in the replacement. For example:

gsub('([0-9])([[:alpha:]])', '\\1 \\2', 'This is a test 22mg')

There's nothing R-specific here; the R help for regex and gsub should be of some use.
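As a minimal sketch of the point above, the following compares a replacement without group references against one that uses them. The vector `x` and the simplified pattern `[0-9]+[a-z]` are illustrative assumptions, not part of the original question; the simplified pattern stands in for the asker's attempt only to show that a replacement string is inserted literally rather than interpreted as a pattern.

```r
# Illustrative inputs (assumed for this sketch).
x <- c("The sample is 22mg", "Take 5ml twice, then 10ml", "no digits here")

# Without capture groups, the replacement is inserted as literal text.
literal <- gsub("[0-9]+[a-z]", "[0-9]+ [a-z]", "22mg")
print(literal)

# With capture groups, \\1 is the digit and \\2 is the letter that follows it,
# so only a space is added between them.
separated <- gsub("([0-9])([[:alpha:]])", "\\1 \\2", x)
print(separated)
```

Because `gsub` is vectorized and replaces every match, each element of `x` is processed and strings without a digit-letter boundary are returned unchanged.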




Answer 2:


You need backreferencing:

> test <- "The sample is 22mg"
> gsub("([0-9])([a-zA-Z])","\\1 \\2",test)
[1] "The sample is 22 mg"

Anything matched inside parentheses is captured. Each capture is then referenced in the replacement as \1 (the first parenthesized group), \2, and so on. In R source code each backslash must be doubled: the first backslash escapes the second in the R string literal, so that a single backslash reaches the regular expression engine.
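The escaping can be checked directly. This is a small sketch, with the variable names `replacement`, `n_chars`, and `result` chosen only for illustration: it shows that the R literal "\\1 \\2" holds five characters, which is the form the regex engine receives.

```r
# The R string literal "\\1 \\2" contains single backslashes once parsed.
replacement <- "\\1 \\2"

# Five characters: backslash, 1, space, backslash, 2.
n_chars <- nchar(replacement)
print(n_chars)

# writeLines shows the string as the regex engine sees it (no escaping).
writeLines(replacement)

# Using it as the replacement inserts a space between the captured groups.
result <- gsub("([0-9])([a-zA-Z])", replacement, "The sample is 22mg")
print(result)
```

Note that `print` would show the string with doubled backslashes again, because it displays the escaped form; `writeLines` or `cat` shows the actual contents.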



Source: https://stackoverflow.com/questions/11605564/r-regex-gsub-separate-letters-and-numbers
