This is a follow-up question for this post: Loop through dataframe in R and measure time difference between two values
I already got excellent help with the following co
Try this.
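library(dplyr)

# Uncomment if Date is still stored as character:
# df$Date <- as.POSIXct(strptime(df$Date, "%d.%m.%Y %H:%M"))
df %>%
  arrange(User, Date) %>%
  group_by(User) %>%
  mutate(
    # Date of the most recent stimulus (rows before the first stimulus get the first stimulus date)
    last.date = Date[which(StimuliA == 1L)[c(1, 1:sum(StimuliA == 1L))][cumsum(StimuliA == 1L) + 1]]
  ) %>%
  mutate(
    # ifelse() drops the difftime class, so the units are fixed explicitly (seconds here)
    timesince = ifelse(Responses == 1L, as.numeric(difftime(Date, last.date, units = "secs")), NA)
  )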
This works by first creating a column that records the date of the last stimulus, and then using ifelse to get the difference between the current date and the last stimulus date. You can filter to extract only the LAST response.
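To make the indexing inside last.date easier to follow, here is a minimal base R sketch of the same trick on a toy vector. The vector values and the helper names (is_stim, stim_pos, lookup, last_pos) are made up for illustration and are not part of the original answer.

```r
# Toy vector: 1 marks a stimulus row, 0 marks any other row.
StimuliA <- c(0L, 1L, 0L, 0L, 1L, 0L)

is_stim  <- StimuliA == 1L
stim_pos <- which(is_stim)  # row positions of the stimuli

# Prepend the first stimulus position so that rows occurring before any
# stimulus (where the running count is 0) still map to a valid position.
lookup <- stim_pos[c(1, 1:sum(is_stim))]

# cumsum() counts the stimuli seen so far; adding 1 skips the prepended entry.
last_pos <- lookup[cumsum(is_stim) + 1]
print(last_pos)
```

Indexing Date with last_pos then gives the stimulus date for every row. Note that the first row, which comes before any stimulus, is mapped to the first stimulus, so its time difference would be negative.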
There is a cleaner way to do the "last.date" operation with zoo::na.locf, but I didn't want to assume you were OK with another package dependency.
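For reference, the zoo variant mentioned above might look roughly like the sketch below. It assumes the zoo package is installed, and the toy Date and StimuliA vectors and the stim_date helper are invented for illustration. One difference from the original expression: rows before the first stimulus end up as NA here instead of being assigned the first stimulus date.

```r
library(zoo)  # assumption: zoo is installed

# Toy data, made up for illustration.
Date <- as.POSIXct(
  c("2015-01-01 09:55", "2015-01-01 10:00", "2015-01-01 10:05",
    "2015-01-01 11:00", "2015-01-01 11:02"),
  tz = "UTC"
)
StimuliA <- c(0L, 1L, 0L, 1L, 0L)

# Keep the timestamp on stimulus rows only; blank out every other row.
stim_date <- replace(Date, StimuliA != 1L, NA)

# Carry each stimulus timestamp forward. na.rm = FALSE keeps the vector
# length and leaves NA for rows before the first stimulus.
last.date <- zoo::na.locf(stim_date, na.rm = FALSE)
print(last.date)
```

Inside the dplyr chain, the same two steps could replace the last.date expression in the grouped mutate call.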
EDIT: To identify the sequence (if I correctly understand what you mean by "sequence"), continue the chain with
%>% mutate(sequence = cumsum(StimuliA))
to identify sequences defined as observations following a positive Stimuli. To filter out the last response of a sequence, continue the chain with
%>% group_by(User, sequence) %>%
filter(timesince == max(timesince, na.rm = TRUE))
to group by sequence (and user) and then extract the maximum time difference associated with each sequence (which will correspond to the last positive response of a sequence).
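As a rough end-to-end check, the whole chain can be run on a small made-up data frame. The data below are invented for illustration, and the sketch assumes dplyr is installed. It also converts the time difference to seconds with difftime, which is a choice made here so that the units stay consistent across groups.

```r
library(dplyr)  # assumption: dplyr is installed

# Toy data: two users, stimuli marked in StimuliA, responses in Responses.
df <- data.frame(
  User = c("a", "a", "a", "a", "a", "b", "b", "b"),
  Date = as.POSIXct(
    c("2015-01-01 10:00", "2015-01-01 10:05", "2015-01-01 10:10",
      "2015-01-01 11:00", "2015-01-01 11:02",
      "2015-01-01 09:00", "2015-01-01 09:01", "2015-01-01 09:04"),
    tz = "UTC"
  ),
  StimuliA  = c(1L, 0L, 0L, 1L, 0L, 1L, 0L, 0L),
  Responses = c(0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L)
)

result <- df %>%
  arrange(User, Date) %>%
  group_by(User) %>%
  mutate(
    # Date of the most recent stimulus for each row
    last.date = Date[which(StimuliA == 1L)[c(1, 1:sum(StimuliA == 1L))][cumsum(StimuliA == 1L) + 1]],
    # Seconds since that stimulus, for response rows only
    timesince = ifelse(Responses == 1L,
                       as.numeric(difftime(Date, last.date, units = "secs")),
                       NA),
    # Running count of stimuli labels each sequence within a user
    sequence = cumsum(StimuliA)
  ) %>%
  group_by(User, sequence) %>%
  # Keep the response with the largest delay, i.e. the last one per sequence
  filter(timesince == max(timesince, na.rm = TRUE)) %>%
  ungroup()

print(result)
```

With this toy data the result should contain one row per sequence: two for user a and one for user b. A sequence without any response would produce a warning from max and contribute no rows.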