I wrote code to extract the date from a given string. Given
> "Date: 2012-07-29, 12:59AM PDT"
it extracts
> "2012-07-29"
A regex with a backreference works:
> sub("^.+([0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]).+$","\\1","Date: 2012-07-29, 12:59AM PDT")
[1] "2012-07-29"
But @Dirk is right that parsing it as a date is the right way to go.
As (pretty much) always, you've got multiple options here, though none of them really frees you from getting used to some basic regular expression syntax (or its close friends).
> raw_date <- "Date: 2012-07-29, 12:59AM PDT"
> gsub(",", "", unlist(strsplit(raw_date, split=" "))[2])
[1] "2012-07-29"
> temp <- gsub(".*: (?=\\d?)", "", raw_date, perl=TRUE)
> out <- gsub("(?<=\\d),.*", "", temp, perl=TRUE)
> out
[1] "2012-07-29"
> library(stringr)
> str_extract(raw_date, "\\d{4}-\\d{2}-\\d{2}")
[1] "2012-07-29"
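If you'd rather not pull in stringr, base R's regmatches() together with regexpr() does roughly the same job as str_extract(). Here is a minimal sketch, assuming a hypothetical vector raw_dates in which some elements contain no date at all. Unlike sub(), which hands back the input unchanged when the pattern doesn't match, this leaves NA for those elements:

```r
# Hypothetical input; the second element deliberately contains no date.
raw_dates <- c("Date: 2012-07-29, 12:59AM PDT", "no date here")

# regexpr() gives the position of the first match per element (-1 if none).
m <- regexpr("[0-9]{4}-[0-9]{2}-[0-9]{2}", raw_dates)

# regmatches() returns only the substrings that matched, so start from NA
# and fill in the elements that actually had a match.
date_str <- rep(NA_character_, length(raw_dates))
date_str[m != -1] <- regmatches(raw_dates, m)

# Convert to Date; the NA entries stay NA.
dates <- as.Date(date_str, format = "%Y-%m-%d")
print(dates)
```

The explicit NA handling is the main design choice here: a failed match shows up as a missing value instead of silently passing the whole original string downstream.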
Something along the lines of this should work:
x <- "Date: 2012-07-29, 12:59AM PDT"
as.Date(substr(x, 7, 16), format="%Y-%m-%d")
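The fixed positions 7 to 16 rely on the string always starting with the six-character prefix "Date: ". The following is only a sketch of how one might guard against that assumption. The helper extract_date() and the input x_bad are made up for illustration, and startsWith() needs R 3.3.0 or later:

```r
x_ok  <- "Date: 2012-07-29, 12:59AM PDT"
x_bad <- "Posted: 2012-07-29, 12:59AM PDT"  # hypothetical input with a longer prefix

# Hypothetical helper: derive the positions from the prefix length and
# return NA whenever the expected prefix is not actually there.
extract_date <- function(x, prefix = "Date: ") {
  start <- nchar(prefix) + 1
  out <- as.Date(substr(x, start, start + 9), format = "%Y-%m-%d")
  out[!startsWith(x, prefix)] <- NA
  out
}

print(extract_date(x_ok))
print(extract_date(x_bad))  # NA, because the prefix differs
```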
You can use strptime() to parse time objects:
R> strptime("Date: 2012-07-29, 11:59AM PDT", "Date: %Y-%m-%d, %I:%M%p", tz="PDT")
[1] "2012-07-29 11:59:00 PDT"
R>
Note that I changed your input time to 11:59AM, as I am unsure that 12:59AM exists. Just to prove that the result is a real date-time object, here it is shifted by three hours (expressed in seconds, the base unit):
R> strptime("Date: 2012-07-29, 11:59AM PDT",
+           "Date: %Y-%m-%d, %I:%M%p", tz="PDT") + 60*60*3
[1] "2012-07-29 14:59:00 PDT"
R>
Oh, and if you just want the date, it is of course even simpler:
R> as.Date(strptime("Date: 2012-07-29, 11:59AM PDT", "Date: %Y-%m-%d"))
[1] "2012-07-29"
R>
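One caveat worth sketching: abbreviations such as "PDT" are not always recognised as time zone names, so a full Olson/IANA name is usually the safer choice. The snippet below is an illustration under two assumptions that are not in the original post: that "America/Los_Angeles" is the zone meant by PDT, and that the session runs in an English or C locale so that %p understands AM/PM. It also tries the original 12:59AM input, which %I and %p read as 59 minutes past midnight:

```r
raw_date <- "Date: 2012-07-29, 11:59AM PDT"
fmt <- "Date: %Y-%m-%d, %I:%M%p"

# Assumed zone name standing in for "PDT"; not taken from the original post.
zone <- "America/Los_Angeles"

# as.POSIXct() turns the list-like POSIXlt result into a compact timestamp.
parsed <- as.POSIXct(strptime(raw_date, fmt, tz = zone))

# Arithmetic is in seconds, so three hours is 3 * 60 * 60.
shifted <- parsed + 3 * 60 * 60
print(format(shifted, "%Y-%m-%d %H:%M"))

# The original input: 12:59AM on a 12-hour clock is 00:59 on a 24-hour clock.
midnight_ish <- strptime("Date: 2012-07-29, 12:59AM PDT", fmt, tz = zone)
print(format(midnight_ish, "%H:%M"))

# Date only, evaluated in the same zone.
print(as.Date(parsed, tz = zone))
```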