Question
I need to create lead and lag variables in R, as shown below.
Suppose I have a data frame with details about each customer's visits to a store:
CustomerID Dateofvisit
1 1/2/2013
1 1/3/2013
1 1/7/2013
2 1/9/2013
2 1/14/2013
2 2/14/2013
3 1/4/2013
3 1/5/2013
As we can see, there are 3 customers with different visit dates. When I apply a lag function (one I wrote myself) to the data above, the result looks like this:
CustomerID Dateofvisit Laggeddate
1 1/2/2013 -
1 1/3/2013 1/2/2013
1 1/7/2013 1/3/2013
2 1/9/2013 1/7/2013
2 1/14/2013 1/9/2013
2 2/14/2013 1/14/2013
3 1/4/2013 2/14/2013
3 1/5/2013 1/4/2013
But I want to lag within each customer as well. So for the 4th row, the lagged date should be empty. Similarly, for the 3rd customer, the first row should be empty and the last row should show 1/4/2013. How do I do this?
The following is the code I use for lag/lead:
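shift <- function(x, shift_by) {
  stopifnot(is.numeric(shift_by))
  stopifnot(is.numeric(x))
  # Several shift amounts: return one shifted column per amount
  if (length(shift_by) > 1)
    return(sapply(shift_by, shift, x = x))
  out <- NULL
  abs_shift_by <- abs(shift_by)
  if (shift_by > 0)
    # Lead: drop the first values and pad the end with NA
    out <- c(tail(x, -abs_shift_by), rep(NA, abs_shift_by))
  else if (shift_by < 0)
    # Lag: pad the start with NA and drop the last values
    out <- c(rep(NA, abs_shift_by), head(x, -abs_shift_by))
  else
    out <- x
  out
}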
And this is how I generate the lead and lag variables:
# generate a variable that leads by 1
test$df_lead2 <- shift(test$x, 1)
# generate a variable that lags by 1
test$df_lag2 <- shift(test$x, -1)
My desired output is:
CustomerID Dateofvisit Laggeddate
1 1/2/2013 -
1 1/3/2013 1/2/2013
1 1/7/2013 1/3/2013
2 1/9/2013 -
2 1/14/2013 1/9/2013
2 2/14/2013 1/14/2013
3 1/4/2013 -
3 1/5/2013 1/4/2013
Answer 1:
Is this what you want?
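library(plyr)
# Apply mutate separately within each CustomerID group.
# Note: c(NA, ...) drops the class of factor or Date columns,
# so this works as shown when Dateofvisit is a character column.
ddply(.data = df, .variables = .(CustomerID), mutate,
      lagdate  = c(NA, head(Dateofvisit, -1)),
      leaddate = c(tail(Dateofvisit, -1), NA))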
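If loading plyr is not an option, the same per-customer lag and lead can probably be done in base R with ave(). This is only a sketch and not part of the original answer. It assumes the dates are stored as character strings and that the rows are already in visit order within each customer. The helpers lag1 and lead1 and the column name Leaddate are hypothetical names introduced here for illustration.

```r
# Sample data from the question; dates kept as character strings (assumption).
df <- data.frame(
  CustomerID  = c(1, 1, 1, 2, 2, 2, 3, 3),
  Dateofvisit = c("1/2/2013", "1/3/2013", "1/7/2013", "1/9/2013",
                  "1/14/2013", "2/14/2013", "1/4/2013", "1/5/2013"),
  stringsAsFactors = FALSE
)

# Hypothetical helpers (not from the original post):
# shift a vector by one position and pad the gap with NA.
lag1  <- function(x) c(NA, head(x, -1))
lead1 <- function(x) c(tail(x, -1), NA)

# ave() applies FUN within each CustomerID group and puts the
# results back in the original row order.
df$Laggeddate <- ave(df$Dateofvisit, df$CustomerID, FUN = lag1)
df$Leaddate   <- ave(df$Dateofvisit, df$CustomerID, FUN = lead1)

print(df)
```

With this data, the first row of each customer gets NA in Laggeddate and the last row of each customer gets NA in Leaddate, which matches the desired output in the question apart from NA standing in for "-".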
Source: https://stackoverflow.com/questions/18486649/create-lead-and-lag-variables-in-r