Create lead and lag variables in R

回眸只為那壹抹淺笑 submitted on 2019-12-23 06:22:12

Question


I need to create lead and lag variables in R, as shown below.

Suppose I have a data frame with details of each customer's visits to a store:

CustomerID  Dateofvisit
1   1/2/2013
1   1/3/2013
1   1/7/2013
2   1/9/2013
2   1/14/2013
2   2/14/2013
3   1/4/2013
3   1/5/2013

As we can see, there are 3 customers with different visit dates. When I apply a lag function to this data (I wrote my own function), the result looks like this:

CustomerID  Dateofvisit Laggeddate
1   1/2/2013    -
1   1/3/2013         1/2/2013
1   1/7/2013         1/3/2013
2   1/9/2013         1/7/2013
2   1/14/2013        1/9/2013
2   2/14/2013        1/14/2013
3   1/4/2013         2/14/2013
3   1/5/2013         1/4/2013

But I want to lag within each customer as well. So for the 4th row, the lagged date should be empty. Similarly, for the 3rd customer, the first row should be empty and the last row should show 1/4/2013. How do I do this?

The following is the code I use for lag/lead:

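# Shift a numeric vector: positive shift_by leads, negative shift_by lags.
# Vacated positions are filled with NA.
shift <- function(x, shift_by) {
    stopifnot(is.numeric(shift_by))
    stopifnot(is.numeric(x))

    # Several shifts at once: return one column per shift
    if (length(shift_by) > 1)
        return(sapply(shift_by, shift, x = x))

    out <- NULL
    abs_shift_by <- abs(shift_by)
    if (shift_by > 0)
        # lead: drop the first values, pad the end with NA
        out <- c(tail(x, -abs_shift_by), rep(NA, abs_shift_by))
    else if (shift_by < 0)
        # lag: pad the start with NA, drop the last values
        out <- c(rep(NA, abs_shift_by), head(x, -abs_shift_by))
    else
        out <- x
    out
}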

And this is how I lead/lag them:

# generate a variable that leads by 1
test$df_lead2 <- shift(test$x, 1)
# generate a variable that lags by 1
test$df_lag2 <- shift(test$x, -1)

My desired output is:

CustomerID  Dateofvisit Laggeddate
1   1/2/2013    -
1   1/3/2013         1/2/2013
1   1/7/2013         1/3/2013
2   1/9/2013         -
2   1/14/2013        1/9/2013
2   2/14/2013        1/14/2013
3   1/4/2013         -
3   1/5/2013         1/4/2013
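
One possible way to get this output while reusing the shift function above is to apply it within each customer rather than to the whole column. The sketch below is only an illustration under some assumptions: the data frame is called df, the dates are month/day/year and stored as Date, and the rows are already in visit order within each customer. Because shift only accepts numeric input, the dates are converted to day counts first and converted back afterwards.

```r
# shift() as defined in the question, reproduced so this sketch runs on its own
shift <- function(x, shift_by) {
    stopifnot(is.numeric(shift_by))
    stopifnot(is.numeric(x))
    if (length(shift_by) > 1)
        return(sapply(shift_by, shift, x = x))
    abs_shift_by <- abs(shift_by)
    if (shift_by > 0)
        c(tail(x, -abs_shift_by), rep(NA, abs_shift_by))
    else if (shift_by < 0)
        c(rep(NA, abs_shift_by), head(x, -abs_shift_by))
    else
        x
}

# Sample data from the question; the name df and the m/d/Y format are assumptions
df <- data.frame(
    CustomerID  = c(1, 1, 1, 2, 2, 2, 3, 3),
    Dateofvisit = as.Date(c("1/2/2013", "1/3/2013", "1/7/2013", "1/9/2013",
                            "1/14/2013", "2/14/2013", "1/4/2013", "1/5/2013"),
                          format = "%m/%d/%Y")
)

# shift() insists on numeric input, so work on the day counts behind the dates
days <- as.numeric(df$Dateofvisit)

# ave() runs the function separately for each CustomerID and keeps row order
lag_days <- ave(days, df$CustomerID, FUN = function(v) shift(v, -1))

# Convert the day counts back to Date; NA stays NA
df$Laggeddate <- as.Date(lag_days, origin = "1970-01-01")
```

With this, the first visit of every customer gets NA as its lagged date, and the last row for customer 3 gets 2013-01-04.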

Answer 1:


Is this what you want?

library(plyr)
ddply(.data = df, .variables = .(CustomerID), mutate,
   lagdate = c(NA, head(Dateofvisit, -1)),
   leaddate = c(tail(Dateofvisit, -1), NA))
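
One caveat with the snippet above: if Dateofvisit is stored as a Date (or a factor), c(NA, ...) may return plain numbers, because c() picks its method from the first argument. A base R sketch that avoids this by shifting row positions instead of the values themselves could look like the following. It assumes the data frame is named df, the dates are month/day/year, and the rows are already ordered by visit date within each customer.

```r
# Sample data from the question; the m/d/Y date format is an assumption
df <- data.frame(
    CustomerID  = c(1, 1, 1, 2, 2, 2, 3, 3),
    Dateofvisit = as.Date(c("1/2/2013", "1/3/2013", "1/7/2013", "1/9/2013",
                            "1/14/2013", "2/14/2013", "1/4/2013", "1/5/2013"),
                          format = "%m/%d/%Y")
)

# Row positions, shifted within each customer; NA marks "no neighbouring row"
rows     <- seq_len(nrow(df))
prev_row <- ave(rows, df$CustomerID, FUN = function(i) c(NA, head(i, -1)))
next_row <- ave(rows, df$CustomerID, FUN = function(i) c(tail(i, -1), NA))

# Indexing with NA yields NA, and the Date class of the column is kept
df$lagdate  <- df$Dateofvisit[prev_row]
df$leaddate <- df$Dateofvisit[next_row]
```

The same indexing works for columns of any type, since only the row positions are shifted.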


Source: https://stackoverflow.com/questions/18486649/create-lead-and-lag-variables-in-r
