R separate words from numbers in string

我只是一个虾纸丫 submitted on 2020-03-26 04:53:32

Question


I need to clean up some data strings that have words and numbers or just numbers.

Below is a toy sample:

library(tidyverse)

c("555","Word 123", "two words 123", "three words here 123") %>%  
sub("(\\w+) (\\d*)",  "\\1|\\2", .)

The result is this:

[1] "555"                  "Word|123"             "two|words 123"        "three|words here 123"

But I want to place the '|' before the last set of numbers, as shown below:

[1] "|555"                  "Word|123"             "two words|123"        "three words here|123"

Answer 1:


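We can use sub to match zero or more spaces (\\s*) followed by a digit that we capture as a group ((\\d)), and in the replacement use | followed by the backreference (\\1) to the captured group. Because sub replaces only the first match, the | lands before the first digit in each string, which for this data is the start of the trailing number.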

sub("\\s*(\\d)", "|\\1", v1)
#[1] "|555"                 "Word|123"            
#[3] "two words|123"        "three words here|123"

data

v1 <- c("555","Word 123", "two words 123", "three words here 123")
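
The pattern above splits at the first digit anywhere in the string. As a hedged variation (not part of the original answer), anchoring a run of digits to the end of the string with \\d+$ should place the | before the last set of numbers even when digits also appear earlier. The vector v2 and its last element are made up for illustration.

```r
# Hypothetical input: the last element has a digit before the trailing number.
v2 <- c("555", "Word 123", "two words 123", "room 2 key 123")

# Original pattern from the answer: splits at the first digit found.
first_digit <- sub("\\s*(\\d)", "|\\1", v2)

# Variation (an assumption, not from the answer): require the digit run
# to reach the end of the string, so only the trailing number is split off.
last_number <- sub("\\s*(\\d+)$", "|\\1", v2)

print(first_digit)
print(last_number)
```

On the original sample data both patterns give the same result; they differ only for strings like the last element of v2.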



Answer 2:


You may use

^(.*?)\s*(\d*)$

Replace with \1|\2.

In R (the . is the pipe placeholder for the vector piped in with %>%, as in the question):

sub("^(.*?)\\s*(\\d*)$", "\\1|\\2", .)

Details

  • ^ - start of string
  • (.*?) - Capturing group 1: any 0+ chars, as few as possible
  • \s* - zero or more whitespaces
  • (\d*) - Capturing group 2: zero or more digits
  • $ - end of string.
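
To try the pattern outside a pipe, the call can be run on the sample vector directly. This is a minimal sketch: the names v1 and out are only illustrative, and perl = TRUE is an added assumption here, chosen so that the lazy quantifier .*? is handled by the PCRE engine.

```r
v1 <- c("555", "Word 123", "two words 123", "three words here 123")

# Group 1 lazily takes everything up to the optional trailing whitespace;
# group 2 takes the trailing digits (possibly none).
out <- sub("^(.*?)\\s*(\\d*)$", "\\1|\\2", v1, perl = TRUE)
print(out)
```

Because both groups may be empty, a string with no trailing number, such as "only words", would come back as "only words|" under this pattern.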


Source: https://stackoverflow.com/questions/55856172/r-separate-words-from-numbers-in-string
