Fill in missing year in ordered list of dates

生来就可爱ヽ(ⅴ<●) submitted on 2019-12-05 10:20:43

Here's one idea:

## Make data easily reproducible
df <- data.frame(day=c(24, 21, 20, 10, 20, 20, 10),
                 month = c("Jun", "Mar", "Jan", "Dec", "Jun", "Jan", "Dec"))


## Convert each month-day combo to its day of the year (the "julian date").
## The dummy year 2012 is a leap year, so a Feb 29 row would still parse.
datestring <- paste("2012", match(df[[2]], month.abb), df[[1]], sep = "-")
date <- strptime(datestring, format = "%Y-%m-%d")
julian <- as.integer(strftime(date, format = "%j"))

## The rows run from newest to oldest, so a year boundary is crossed
## wherever the day of the year increases from one row to the next
df$year <- 2014 - cumsum(diff(c(julian[1], julian)) > 0)

## Check that it worked
df
#   day month year
# 1  24   Jun 2014
# 2  21   Mar 2014
# 3  20   Jan 2014
# 4  10   Dec 2013
# 5  20   Jun 2013
# 6  20   Jan 2013
# 7  10   Dec 2012
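
To make the logic above easier to follow, here is a minimal sketch that prints the intermediate vectors for the same seven rows. The names `dummy`, `doy` and `new_year` are hypothetical and not part of the original answer, and `as.Date()` is swapped in for `strptime()` only to keep time zones out of the picture.

```r
# Same 7-row sample as above
df <- data.frame(day = c(24, 21, 20, 10, 20, 20, 10),
                 month = c("Jun", "Mar", "Jan", "Dec", "Jun", "Jan", "Dec"))

# Day of the year within the dummy leap year 2012
dummy <- as.Date(paste("2012", match(df$month, month.abb), df$day, sep = "-"),
                 format = "%Y-%m-%d")
doy <- as.integer(format(dummy, "%j"))
doy
# [1] 176  81  20 345 172  20 345

# TRUE wherever the day of the year goes up compared with the previous row,
# i.e. wherever the list steps back across a New Year
new_year <- diff(c(doy[1], doy)) > 0
new_year
# [1] FALSE FALSE FALSE  TRUE FALSE FALSE  TRUE

# Running count of those steps = number of years to subtract from 2014
cumsum(new_year)
# [1] 0 0 0 1 1 1 2
```

Note that this only works if the rows are sorted from newest to oldest and every year has at least one row; a year with no observations cannot be detected from month and day alone.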

The OP wants the missing years filled in descending order, starting from 2014.

Here is an alternative approach that needs neither date conversion nor fake dates. It can also be adapted to fiscal years that start in a month other than January.

# create sample dataset
df <- data.frame(
  day = c(24L, 21L, 20L, 10L, 20L, 20L, 21L, 10L, 30L, 10L, 10L, 7L),
  month = c("Jun", "Mar", "Jan", "Dec", "Jun", "Jan", "Jan", "Dec", "Jan", 
            "Jan", "Jan", "Jun"))

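# Encode each row as 100 * month number + day; a rise in that value from one
# row to the next marks a step back into the previous year
df$year <- 2014 - cumsum(c(0L, diff(100L*as.integer(
  factor(df$month, levels = month.abb)) + df$day) > 0))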
df
   day month year
1   24   Jun 2014
2   21   Mar 2014
3   20   Jan 2014
4   10   Dec 2013
5   20   Jun 2013
6   20   Jan 2013
7   21   Jan 2012
8   10   Dec 2011
9   30   Jan 2011
10  10   Jan 2011
11  10   Jan 2011
12   7   Jun 2010
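
The one-liner above packs several steps into a single expression. As a rough sketch, it can be unrolled as follows; the intermediate names `month_no`, `key`, `step_back` and `offset` are hypothetical and only serve the illustration.

```r
# Same 12-row sample as above
df <- data.frame(
  day = c(24L, 21L, 20L, 10L, 20L, 20L, 21L, 10L, 30L, 10L, 10L, 7L),
  month = c("Jun", "Mar", "Jan", "Dec", "Jun", "Jan", "Jan", "Dec", "Jan",
            "Jan", "Jan", "Jun"))

# 1. Month number (Jan = 1 ... Dec = 12), taken from the factor's level order
month_no <- as.integer(factor(df$month, levels = month.abb))

# 2. Sortable key: 100 * month + day, e.g. 24 Jun -> 624, 10 Dec -> 1210
key <- 100L * month_no + df$day

# 3. A key that rises from one row to the next means the list has
#    stepped back into the previous year (the first row never does)
step_back <- c(FALSE, diff(key) > 0)

# 4. Count the steps and subtract them from the starting year
offset <- cumsum(step_back)
offset
# [1] 0 0 0 1 1 1 2 3 3 3 3 4
df$year <- 2014L - offset
```

Because the comparison is strict (`> 0`), two consecutive rows with the same date, such as rows 10 and 11, are assigned to the same year.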

Completion of fiscal years

Let's assume the business has decided to start its fiscal year on February 1. Thus, January lies in a different fiscal year than February or March of the same calendar year.

To handle fiscal years, we only need to rotate the factor levels so that February comes first and January last:

df$fy <- 2014 - cumsum(c(0L, diff(100L*as.integer(
  factor(df$month, levels = month.abb[c(2:12, 1)])) + df$day) > 0))
df
   day month year   fy
1   24   Jun 2014 2014
2   21   Mar 2014 2014
3   20   Jan 2014 2013
4   10   Dec 2013 2013
5   20   Jun 2013 2013
6   20   Jan 2013 2012
7   21   Jan 2012 2011
8   10   Dec 2011 2011
9   30   Jan 2011 2010
10  10   Jan 2011 2010
11  10   Jan 2011 2010
12   7   Jun 2010 2010
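
If the fiscal year may start in any month, the rotation of the levels can be parameterised. The helper below is a hypothetical sketch, not part of the original answer: `fill_years`, `start_month` and `first_year` are made-up names, and the function assumes that the rows run from newest to oldest and that no year is skipped entirely.

```r
# Hypothetical helper (assumption, not from the original answers).
# start_month: 1-12, the calendar month in which the fiscal year begins
# first_year:  the year assigned to the first (most recent) row
fill_years <- function(day, month, start_month = 1L, first_year = 2014L) {
  stopifnot(start_month %in% 1:12, all(month %in% month.abb))
  # Rotate month.abb so that the start month becomes level 1
  lev <- month.abb[c(start_month:12, seq_len(start_month - 1L))]
  key <- 100L * as.integer(factor(month, levels = lev)) + as.integer(day)
  # A rising key marks a step back into the previous (fiscal) year
  first_year - cumsum(c(0L, diff(key) > 0))
}

# Same 12-row sample as above
df <- data.frame(
  day = c(24L, 21L, 20L, 10L, 20L, 20L, 21L, 10L, 30L, 10L, 10L, 7L),
  month = c("Jun", "Mar", "Jan", "Dec", "Jun", "Jan", "Jan", "Dec", "Jan",
            "Jan", "Jan", "Jun"))

df$year <- fill_years(df$day, df$month)                   # calendar years
df$fy   <- fill_years(df$day, df$month, start_month = 2L) # fiscal year from Feb 1
```

With `start_month = 1L` the levels stay in calendar order, so the same function reproduces the `year` column; with `start_month = 2L` it reproduces the `fy` column shown above.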