Extracting numbers from vectors of strings

佛祖请我去吃肉 2020-11-22 04:05

I have a character vector like this:

years <- c("20 years old", "1 years old")

I would like to extract only the numbers from this vector. Expe

11 answers
  • 2020-11-22 04:20

    Or simply:

    as.numeric(gsub("\\D", "", years))
    # [1] 20  1
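One caveat worth checking before relying on this: deleting every non-digit character concatenates whatever digits remain, so it only behaves as intended when each element contains exactly one integer. A minimal sketch, using a made-up vector `ages` that is not part of the question:

```r
# Hypothetical input: the second element contains two separate numbers
ages <- c("20 years old", "1 year 6 months")

# Deleting all non-digits glues "1" and "6" together into "16"
res <- as.numeric(gsub("\\D", "", ages))
res
# [1] 20 16
```

If an element may hold more than one number, a matching approach is likely a safer choice than substitution.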
    
  • 2020-11-22 04:22

    I think that substitution is an indirect way of getting to the solution. If you want to retrieve all the numbers, I recommend gregexpr:

    matches <- regmatches(years, gregexpr("[[:digit:]]+", years))
    as.numeric(unlist(matches))
    

    If you have multiple matches in a string, this will get all of them. If you're only interested in the first match, use regexpr instead of gregexpr and you can skip the unlist.
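To illustrate the difference between the two functions, here is a small sketch on a made-up vector `x` (an assumption for illustration) in which one element holds two numbers and one holds none. Note that `regmatches` combined with `regexpr` drops elements that have no match, so the result can be shorter than the input.

```r
# Hypothetical input: two numbers in one element, none in another
x <- c("20 years old", "1 year 6 months", "unknown")

# gregexpr: every match, returned as a list with one entry per input string
all_matches <- regmatches(x, gregexpr("[[:digit:]]+", x))
all_nums <- lapply(all_matches, as.numeric)
# all_nums[[2]] is c(1, 6); all_nums[[3]] is numeric(0)

# regexpr: first match only; the unmatched "unknown" is dropped
first_nums <- as.numeric(regmatches(x, regexpr("[[:digit:]]+", x)))
# first_nums is c(20, 1), length 2 although x has length 3
```

Keeping the list from `gregexpr` instead of calling `unlist` preserves which numbers came from which string.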

  • 2020-11-22 04:29

    Update: Since extract_numeric is deprecated, we can use parse_number from the readr package.

    library(readr)
    parse_number(years)
    

    The original answer used extract_numeric from tidyr (now deprecated):

    library(tidyr)
    extract_numeric(years)
    #[1] 20  1
    
  • 2020-11-22 04:32

    How about

    # capture the first run of digits and drop everything after it
    as.numeric(gsub("([0-9]+).*$", "\\1", years))
    

    or

    # simply remove the literal " years old" suffix
    as.numeric(gsub(" years old", "", years))
    

    or

    # split by space, get the element in first index
    as.numeric(sapply(strsplit(years, " "), "[[", 1))
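These variants all assume the number comes first in each string. A hedged sketch of what happens otherwise, using a made-up vector `x`; the anchored pattern in the second half is an assumption added for illustration, not part of the answer above:

```r
# Hypothetical input: the number is not at the start of the second element
x <- c("20 years old", "aged 35 years")

# Unanchored capture: only the text from the first digit onward is replaced,
# so the leading "aged " survives and the conversion yields NA
loose <- suppressWarnings(as.numeric(gsub("([0-9]+).*$", "\\1", x)))
# loose is c(20, NA)

# Anchored variant: skip leading non-digits, capture the first run of
# digits, and discard the rest of the string
strict <- as.numeric(sub("^\\D*(\\d+).*$", "\\1", x))
# strict is c(20, 35)
```

Anchoring with `^` and `$` makes the replacement cover the whole string, so nothing but the captured digits is left behind.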
    
  • 2020-11-22 04:38

    Using the unglue package, we can do:

    # install.packages("unglue")
    library(unglue)
    
    years<-c("20 years old", "1 years old")
    unglue_vec(years, "{x} years old", convert = TRUE)
    #> [1] 20  1
    

    Created on 2019-11-06 by the reprex package (v0.3.0)

    More info: https://github.com/moodymudskipper/unglue/blob/master/README.md

  • 2020-11-22 04:39

    Here's an alternative to Arun's first solution, with a simpler Perl-like regular expression:

    as.numeric(gsub("[^\\d]+", "", years, perl=TRUE))
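Digit-only patterns like this one also discard decimal points and minus signs. If the data may contain such values, matching the whole number is one possible alternative. The sketch below uses a made-up vector `x` and a pattern that is an assumption for illustration (it handles a leading `-` and a `.` decimal separator only):

```r
# Hypothetical input with a decimal and a negative value
x <- c("1.5 years old", "-3 degrees")

# Stripping non-digits loses the decimal point and the minus sign
stripped <- as.numeric(gsub("[^\\d]+", "", x, perl = TRUE))
# stripped is c(15, 3)

# Match an optional sign, digits, and an optional fractional part instead
pattern <- "-?[0-9]+(\\.[0-9]+)?"
signed <- as.numeric(regmatches(x, regexpr(pattern, x)))
# signed is c(1.5, -3)
```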
    