Is there a shorter way to extract a date from a string?

轮回少年 2021-01-12 02:14

I wrote code to extract the date from a given string. Given

  > "Date: 2012-07-29, 12:59AM PDT"

it extracts

  > "2012-07-29"

4 answers
  • 2021-01-12 02:47

    Regex with backreferencing works:

    > sub("^.+([0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]).+$","\\1","Date: 2012-07-29, 12:59AM PDT")
    [1] "2012-07-29"
    

    But @Dirk is right that parsing it as a date is the right way to go.
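
A base-R variant of the same idea, as a minimal sketch: instead of matching the whole string and keeping a backreference, `regexpr()` locates the date pattern and `regmatches()` returns the matched text; `as.Date()` then confirms that the text is a real calendar date. The variable names are illustrative, and the pattern assumes the date is always written as YYYY-MM-DD.

```r
raw_date <- "Date: 2012-07-29, 12:59AM PDT"

# Four digits, dash, two digits, dash, two digits (assumed input layout)
pattern <- "[0-9]{4}-[0-9]{2}-[0-9]{2}"

# regexpr() finds the first match; regmatches() extracts the matched text
date_text <- regmatches(raw_date, regexpr(pattern, raw_date))

# Parsing rejects impossible dates such as "2012-13-45" (the result is NA)
parsed <- as.Date(date_text, format = "%Y-%m-%d")

print(date_text)
print(parsed)
```

If the string contains no date, `regmatches()` returns a zero-length vector, so checking `length(date_text)` before parsing is probably worthwhile.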

  • 2021-01-12 02:59

    As (pretty much) always, you have multiple options here, though none of them frees you from getting used to some basic regular expression syntax (or its close relatives).

    raw_date <- "Date: 2012-07-29, 12:59AM PDT"
    

    Alternative 1

    > gsub(",", "", unlist(strsplit(raw_date, split=" "))[2])
    [1] "2012-07-29"
    

    Alternative 2

    > temp <- gsub(".*: (?=\\d)", "", raw_date, perl=TRUE)
    > out <- gsub("(?<=\\d),.*", "", temp, perl=TRUE)
    > out
    [1] "2012-07-29"
    

    Alternative 3

    > library(stringr)
    > str_extract(raw_date, "\\d{4}-\\d{2}-\\d{2}")
    [1] "2012-07-29"
    
  • 2021-01-12 03:03

    Something along these lines should work, provided the date always starts at the seventh character:

    x <- "Date: 2012-07-29, 12:59AM PDT"
    as.Date(substr(x, 7, 16), format="%Y-%m-%d")
    
  • 2021-01-12 03:09

    You can use strptime() to parse time objects:

    R> strptime("Date: 2012-07-29, 11:59AM PDT", "Date: %Y-%m-%d, %I:%M%p", tz="America/Los_Angeles")
    [1] "2012-07-29 11:59:00 PDT"
    R> 
    

    Note that I changed the time in your input string to 11:59AM, as I was unsure whether 12:59AM exists. Just to prove the point, here is the parsed time shifted by three hours (expressed in seconds, the base unit):

    R> strptime("Date: 2012-07-29, 11:59AM PDT", 
    +           "Date: %Y-%m-%d, %I:%M%p", tz="America/Los_Angeles") + 60*60*3
    [1] "2012-07-29 14:59:00 PDT"
    R> 
    

    Oh, and if you just want the date, it is of course even simpler:

    R> as.Date(strptime("Date: 2012-07-29, 11:59AM PDT", "Date: %Y-%m-%d"))
    [1] "2012-07-29"
    R> 
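
As a hedged follow-up sketch on the original input: 12:59AM is a valid time (59 minutes past midnight), and `%I` together with `%p` parses it as such. `as.Date()` also accepts the format string directly, so the `strptime()` step can be skipped when only the date is needed. The variable names are illustrative; the `Sys.setlocale()` call is an assumption added here because `%p` depends on the locale's AM/PM markers.

```r
# %p matches the locale's AM/PM strings, so pin the time locale for reproducibility
invisible(Sys.setlocale("LC_TIME", "C"))

x <- "Date: 2012-07-29, 12:59AM PDT"

# Date only: text after the matched part of the format is ignored
d <- as.Date(x, format = "Date: %Y-%m-%d")

# Full timestamp: 12 AM on the 12-hour clock is hour 0 on the 24-hour clock
stamp <- strptime(x, "Date: %Y-%m-%d, %I:%M%p", tz = "UTC")
clock <- format(stamp, "%H:%M")

print(d)      # "2012-07-29"
print(clock)  # "00:59"
```

The sketch parses in UTC only to keep the example independent of the machine's time zone database; the trailing "PDT" in the input is not interpreted by the format string.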
    