I have string like this:
years<-c(\"20 years old\", \"1 years old\")
I would like to grep only the numeric number from this vector. Expe
We can also use str_extract
from stringr
years<-c("20 years old", "1 years old")
as.integer(stringr::str_extract(years, "\\d+"))
#[1] 20 1
If there are multiple numbers in the string and we want to extract all of them, we may use str_extract_all
which unlike str_extract
returns all the macthes.
years<-c("20 years old and 21", "1 years old")
stringr::str_extract(years, "\\d+")
#[1] "20" "1"
stringr::str_extract_all(years, "\\d+")
#[[1]]
#[1] "20" "21"
#[[2]]
#[1] "1"
You could get rid of all the letters too:
as.numeric(gsub("[[:alpha:]]", "", years))
Likely this is less generalizable though.
After the post from Gabor Grothendieck post at the r-help mailing list
years<-c("20 years old", "1 years old")
library(gsubfn)
pat <- "[-+.e0-9]*\\d"
sapply(years, function(x) strapply(x, pat, as.numeric)[[1]])
A stringr
pipelined solution:
library(stringr)
years %>% str_match_all("[0-9]+") %>% unlist %>% as.numeric
Extract numbers from any string at beginning position.
x <- gregexpr("^[0-9]+", years) # Numbers with any number of digits
x2 <- as.numeric(unlist(regmatches(years, x)))
Extract numbers from any string INDEPENDENT of position.
x <- gregexpr("[0-9]+", years) # Numbers with any number of digits
x2 <- as.numeric(unlist(regmatches(years, x)))