I am having trouble calculating a date from a value imported from a .csv file. I want to take the date stored in the factor DateClosed and subtract a number of days held in another field (a). For example, if a=203, I want the result to be the equivalent of DateClosed-203. However, the code below does not give the dates I expect.
DateClosed is a factor.
> head(DateClosed)
[1] 7/30/2007 12/12/2007 5/8/2009 6/24/2009 6/24/2009 2/29/2008
165 Levels: 1/12/2010 1/15/2011 1/15/2013 1/17/2009 1/18/2008 1/19/2012 1/2/2013 1/21/2013 1/22/2010 1/24/2013 1/26/2014 ... 9/7/2010
> head(as.Date(DateClosed,format="%m/%d/%y"))
[1] "2020-07-30" "2020-12-12" "2020-05-08" "2020-06-24" "2020-06-24" "2020-02-29"
> head(as.Date(DateClosed,format="%m/%d/%y"))-203
[1] "2020-01-09" "2020-05-23" "2019-10-18" "2019-12-04" "2019-12-04" "2019-08-10"
It subtracts 203 days correctly, but for some reason the year is read wrong: every date comes out in 2020 instead of its actual year.
DateClosed <- factor(c("7/30/2007","12/12/2007", "5/8/2009"))
as.Date(DateClosed, format="%m/%d/%Y")
Produces:
[1] "2007-07-30" "2007-12-12" "2009-05-08"
Notice the capital "Y" in the format parameter. The lowercase "y" is for two-digit years, so as.Date reads only the first two digits of the year token ("20") and ignores the remaining digits. On input, R prefixes two-digit years 00 to 68 with 20 (and 69 to 99 with 19), so "20" becomes 2020 and you end up with dates in 2020.
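To tie this back to the original goal, here is a minimal base R sketch that contrasts the two format codes and then subtracts a day offset. The offset vector `a` and its values are assumptions made for illustration, based on the question's description of the field.

```r
# Sample values from the question, stored as a factor as they would be after import
DateClosed <- factor(c("7/30/2007", "12/12/2007", "5/8/2009"))

# %y consumes only two digits of the year ("20"), so the year is wrong
wrong <- as.Date(DateClosed, format = "%m/%d/%y")

# %Y consumes the full four-digit year
closed <- as.Date(DateClosed, format = "%m/%d/%Y")

# Hypothetical per-row offsets in days (the question's field `a`)
a <- c(203, 203, 10)

# Subtracting a number from a Date moves it back by that many days
result <- closed - a
print(wrong)
print(result)
```

Because subtraction on Date objects is vectorised, a single number such as 203 or a vector with one offset per row both work.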
Manipulating dates becomes much easier with the lubridate package.
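library(lubridate)
mdy(factor(c("7/30/2007","12/12/2007", "5/8/2009")))
[1] "2007-07-30 UTC" "2007-12-12 UTC" "2009-05-08 UTC"
Recent lubridate versions return a Date from mdy, so the same values print without the "UTC" suffix.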
Or using parse_date_time from the same package:
parse_date_time(factor(c("7/30/2007","12/12/2007", "5/8/2009")),c('mdY'))
[1] "2007-07-30 UTC" "2007-12-12 UTC" "2009-05-08 UTC"
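The lubridate route can also cover the subtraction the question asks for. The following is a sketch, assuming lubridate is installed and that the offset `a` is a whole number of days. The factor is converted with as.character first so the example does not depend on how a given lubridate version handles factors.

```r
library(lubridate)

# Sample values from the question, stored as a factor
DateClosed <- factor(c("7/30/2007", "12/12/2007", "5/8/2009"))

# Parse month/day/year strings; as.character makes the factor handling explicit
closed <- mdy(as.character(DateClosed))

# Hypothetical offset in days (the question's example value)
a <- 203

# days() builds a period of `a` days, which is subtracted from each date
result <- closed - days(a)
print(result)
```

Plain `closed - a` gives the same dates here; `days(a)` just states the unit explicitly.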
Source: https://stackoverflow.com/questions/22210160/incorrect-conversion-of-date-as-a-factor-to-a-date