Question
I need to clean up some data strings that contain either words followed by numbers, or just numbers.
Below is a toy sample:
library(tidyverse)
c("555","Word 123", "two words 123", "three words here 123") %>%
sub("(\\w+) (\\d*)", "\\1|\\2", .)
The result is this:
[1] "555" "Word|123" "two|words 123" "three|words here 123"
But I want to place the '|' before the last set of digits, as shown below:
[1] "|555" "Word|123" "two words|123" "three words here|123"
Answer 1:
We can use sub to match zero or more whitespace characters (\\s*) followed by a digit captured as a group ((\\d)), and in the replacement use | followed by the backreference (\\1) to the captured group.
sub("\\s*(\\d)", "|\\1", v1)
#[1] "|555" "Word|123"
#[3] "two words|123" "three words here|123"
Data:
v1 <- c("555","Word 123", "two words 123", "three words here 123")
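Note that sub replaces only the first match, so the pattern above relies on the trailing number holding the first digit in the string. A minimal sketch of that behavior, using a hypothetical input v2 that is not part of the original question:

```r
# Hypothetical input: the first word itself contains a digit
v2 <- c("abc1 def 123", "two words 123")

# sub() replaces only the first match, so the "|" lands before the first digit
res <- sub("\\s*(\\d)", "|\\1", v2)
print(res)
#[1] "abc|1 def 123" "two words|123"
```

If the words may contain digits, anchoring the pattern to the end of the string is the safer choice.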
Answer 2:
You may use
^(.*?)\s*(\d*)$
Replace with \1|\2.
In R:
sub("^(.*?)\\s*(\\d*)$", "\\1|\\2", .)
Details
^ - start of string
(.*?) - Capturing group 1: any 0+ chars, as few as possible
\s* - zero or more whitespaces
(\d*) - Capturing group 2: zero or more digits
$ - end of string.
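The pattern can be tried on the sample vector from the question. This is a minimal sketch with two assumptions not in the original: perl = TRUE is passed so that the lazy quantifier is handled by the PCRE engine, and the name res is only illustrative.

```r
v1 <- c("555", "Word 123", "two words 123", "three words here 123")

# Group 1 takes as little as possible, so the trailing digits fall into group 2
res <- sub("^(.*?)\\s*(\\d*)$", "\\1|\\2", v1, perl = TRUE)
print(res)
#[1] "|555"                 "Word|123"             "two words|123"        "three words here|123"
```

Because group 2 may be empty, a string with no trailing digits, such as "only words", would come back with the | appended at the end.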
Source: https://stackoverflow.com/questions/55856172/r-separate-words-from-numbers-in-string