问题
I have a dataset which has project name, start year and contract term. I need to develop this dataset into time series. For example, one row in my dataset is: Project A, start year 2003 and contract term 5. I would like to repeat each row based on contract term. My dataset looks like this:
Project Name Start Year Contract Term
A 2003 5
B 2013 3
C 2000 2
My desired result should look like this:
Project Name Start Year Contract Term
A 2003 5
A 2004 5
A 2005 5
A 2006 5
A 2007 5
B 2013 3
B 2014 3
B 2014 3
C 2000 2
C 2001 2
I have tried:
rpsData <- rpsInput[rep(rownames(rpsInput), rpsInput$Contract.Term), ]
But this only repeats each project by the number in contract term. I can not make it to increment the years.
Thanks in advance!
回答1:
Here it is in two steps:
Step 1, you know:
rpsData <- rpsInput[rep(rownames(rpsInput), rpsInput$Contract.Term), ]
rpsData
# Project.Name Start.Year Contract.Term
# 1 A 2003 5
# 1.1 A 2003 5
# 1.2 A 2003 5
# 1.3 A 2003 5
# 1.4 A 2003 5
# 2 B 2013 3
# 2.1 B 2013 3
# 2.2 B 2013 3
# 3 C 2000 2
# 3.1 C 2000 2
Step 2 makes use of sequence
and basic addition:
sequence(rpsInput$Contract.Term) ## This will be helpful...
# [1] 1 2 3 4 5 1 2 3 1 2
rpsData$Start.Year <- rpsData$Start.Year + sequence(rpsInput$Contract.Term)
rpsData
# Project.Name Start.Year Contract.Term
# 1 A 2004 5
# 1.1 A 2005 5
# 1.2 A 2006 5
# 1.3 A 2007 5
# 1.4 A 2008 5
# 2 B 2014 3
# 2.1 B 2015 3
# 2.2 B 2016 3
# 3 C 2001 2
# 3.1 C 2002 2
回答2:
Just to piggy back on Ananda's answer, change
sequence(rpsInput$Contract.Term)
to
(sequence(rpsInput$Contract.Term)-1)
to get the output you desire.
ProjectName<-c("A","B","C")
Start.Year<-c(2003,2013,2000)
Contract.Term<-c(5,3,2)
rpsInput<-data.frame(ProjectName,Start.Year,Contract.Term)
rpsData <- rpsInput[rep(rownames(rpsInput), rpsInput$Contract.Term), ]
rpsData$Start.Year <- rpsData$Start.Year + (sequence(rpsInput$Contract.Term)-1)
rpsData
# ProjectName Start.Year Contract.Term
#1 A 2003 5
#1.1 A 2004 5
#1.2 A 2005 5
#1.3 A 2006 5
#1.4 A 2007 5
#2 B 2013 3
#2.1 B 2014 3
#2.2 B 2015 3
#3 C 2000 2
#3.1 C 2001 2
来源:https://stackoverflow.com/questions/23138461/repeat-rows-in-a-dataset-based-on-a-column-but-increment-the-rows