Repeat rows in a dataset based on a column, but increment the rows [duplicate]

Submitted by 我的未来我决定 on 2019-12-11 03:44:43

Question


I have a dataset containing a project name, start year, and contract term for each project, and I need to expand it into a time series. For example, one row in my dataset is: Project A, start year 2003, contract term 5. I would like to repeat each row as many times as its contract term, incrementing the year in each repeated row. My dataset looks like this:

Project Name    Start Year    Contract Term
A                 2003            5
B                 2013            3
C                 2000            2

My desired result should look like this:

Project Name    Start Year    Contract Term
A                 2003            5
A                 2004            5
A                 2005            5
A                 2006            5
A                 2007            5

B                 2013            3
B                 2014            3
B                 2015            3

C                 2000            2
C                 2001            2

I have tried:

rpsData <- rpsInput[rep(rownames(rpsInput), rpsInput$Contract.Term), ]

But this only repeats each project as many times as its contract term. I cannot get it to increment the years.

Thanks in advance!


Answer 1:


Here it is in two steps:

Step 1 you already have:

rpsData <- rpsInput[rep(rownames(rpsInput), rpsInput$Contract.Term), ]
rpsData
#     Project.Name Start.Year Contract.Term
# 1              A       2003             5
# 1.1            A       2003             5
# 1.2            A       2003             5
# 1.3            A       2003             5
# 1.4            A       2003             5
# 2              B       2013             3
# 2.1            B       2013             3
# 2.2            B       2013             3
# 3              C       2000             2
# 3.1            C       2000             2

Step 2 makes use of sequence and basic addition:

sequence(rpsInput$Contract.Term) ## This will be helpful...
#  [1] 1 2 3 4 5 1 2 3 1 2

rpsData$Start.Year <- rpsData$Start.Year + sequence(rpsInput$Contract.Term)
rpsData
#     Project.Name Start.Year Contract.Term
# 1              A       2004             5
# 1.1            A       2005             5
# 1.2            A       2006             5
# 1.3            A       2007             5
# 1.4            A       2008             5
# 2              B       2014             3
# 2.1            B       2015             3
# 2.2            B       2016             3
# 3              C       2001             2
# 3.1            C       2002             2
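For readers unfamiliar with sequence(), here is a minimal sketch of what it does: it concatenates 1..n for each n it is given, which is why the counter restarts for every project. The vector `terms` is a made-up stand-in for rpsInput$Contract.Term, and the names `via_sequence` and `by_hand` are illustrative only.

```r
# Hypothetical stand-in for rpsInput$Contract.Term
terms <- c(5, 3, 2)

# sequence() builds 1..n for each n and concatenates the pieces
via_sequence <- sequence(terms)

# The same vector assembled by hand with seq_len()
by_hand <- unlist(lapply(terms, seq_len))

# Compare the two as integer vectors
same <- identical(as.integer(via_sequence), as.integer(by_hand))
print(via_sequence)
print(same)
```

Because the counter starts at 1 rather than 0, adding it directly to Start.Year shifts every row forward by one year, as the output above shows.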



Answer 2:


Just to piggyback on Ananda's answer, change

sequence(rpsInput$Contract.Term)

to

(sequence(rpsInput$Contract.Term)-1)

to get the output you desire.

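# Rebuild the example input
ProjectName <- c("A", "B", "C")
Start.Year <- c(2003, 2013, 2000)
Contract.Term <- c(5, 3, 2)
rpsInput <- data.frame(ProjectName, Start.Year, Contract.Term)
# Repeat each row Contract.Term times
rpsData <- rpsInput[rep(rownames(rpsInput), rpsInput$Contract.Term), ]
# Add a zero-based offset (0, 1, ..., term - 1) so the first row keeps the original start year
rpsData$Start.Year <- rpsData$Start.Year + (sequence(rpsInput$Contract.Term) - 1)
rpsData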
#    ProjectName Start.Year Contract.Term
#1             A       2003             5
#1.1           A       2004             5
#1.2           A       2005             5
#1.3           A       2006             5
#1.4           A       2007             5
#2             B       2013             3
#2.1           B       2014             3
#2.2           B       2015             3
#3             C       2000             2
#3.1           C       2001             2
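
The two steps can also be wrapped into one function. This is only a sketch under stated assumptions: the helper name `expand_by_term` and its arguments `year_col` and `term_col` are hypothetical and do not appear in either answer. It indexes rows by position instead of by row name and resets the row names afterwards.

```r
# Hypothetical helper; the name and arguments are illustrative only
expand_by_term <- function(df, year_col = "Start.Year", term_col = "Contract.Term") {
  terms <- df[[term_col]]
  # Repeat each row index as many times as its contract term
  out <- df[rep(seq_len(nrow(df)), terms), , drop = FALSE]
  # Zero-based offset within each project: 0, 1, ..., term - 1
  out[[year_col]] <- out[[year_col]] + sequence(terms) - 1
  # Replace row names such as "1.1" with plain 1..n
  rownames(out) <- NULL
  out
}

# Same example input as in the answer
rpsInput <- data.frame(
  ProjectName = c("A", "B", "C"),
  Start.Year = c(2003, 2013, 2000),
  Contract.Term = c(5, 3, 2)
)
rpsData <- expand_by_term(rpsInput)
print(rpsData)
```

Resetting the row names is optional; it simply avoids labels such as 1.1 and 1.2 in the printed result.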


Source: https://stackoverflow.com/questions/23138461/repeat-rows-in-a-dataset-based-on-a-column-but-increment-the-rows
