Generate a sequence (starting over in case of a recurrence) and add a new column with the highest number per sequence, within group, in R

家住魔仙堡 submitted on 2019-12-06 06:11:59

Construct a run-length encoding with rle() and use the run lengths to generate the sequences:

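# Name the result 'runs' so it does not mask the rle() function
runs <- rle(as.character(mydf$City))
# The component is called 'lengths'; 'length' only works through partial matching
mydf$Sequence <- unlist(lapply(runs$lengths, seq_len))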
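
As a minimal sketch of what the run-length encoding contains, here is the same idea on a small made-up data frame. The sample values are hypothetical; only the column name City comes from the answer. It uses base R's sequence(), which gives the same result as calling seq_len on each run length and unlisting:

```r
# Hypothetical sample data, for illustration only
mydf <- data.frame(City = c("Paris", "Paris", "Rome", "Paris", "Paris", "Paris"),
                   stringsAsFactors = FALSE)

runs <- rle(as.character(mydf$City))
# runs$values  is "Paris" "Rome" "Paris"
# runs$lengths is 2 1 3

# One counter per run, restarting at 1 each time the city changes
mydf$Sequence <- sequence(runs$lengths)
# mydf$Sequence is 1 2 1 1 2 3
```

Note that the second block of "Paris" rows starts again at 1, because rle() only groups consecutive equal values.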

For the updated question, where two columns form the key, paste the columns together with a separator that does not occur in the data and compute the run lengths on the combined key:

# "\r" is used as a separator because it is unlikely to appear in ID or City
runs <- rle(paste(mydf$ID, mydf$City, sep = "\r"))
mydf$Sequence <- unlist(lapply(runs$lengths, seq_len))

This is vectorized, so it should be fast, especially compared to a for loop.
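
The title also asks for a column holding the highest number in each sequence, which the code above does not add. One possible way, sketched here on hypothetical sample data, is to repeat each run's length across its rows, since the length of a run is also its highest sequence number. The column name Max is an assumption made for illustration:

```r
# Hypothetical sample data, for illustration only
mydf <- data.frame(ID = c(1, 1, 1, 1, 2, 2),
                   City = c("Paris", "Paris", "Rome", "Paris", "Paris", "Paris"),
                   stringsAsFactors = FALSE)

# Runs of consecutive rows that share both ID and City
runs <- rle(paste(mydf$ID, mydf$City, sep = "\r"))
mydf$Sequence <- sequence(runs$lengths)

# Repeat each run length once per row of that run
mydf$Max <- rep(runs$lengths, times = runs$lengths)
# mydf$Sequence is 1 2 1 1 1 2
# mydf$Max      is 2 2 1 1 2 2
```

Rows 4 and 5 share the city but not the ID, so the counter restarts there as well.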

Alternatively, a good old for loop does the trick:

mydf$Sequence <- NA_integer_

for (i in seq_len(nrow(mydf))) {
  # Restart at 1 on the first row or whenever City or ID changes
  if (i == 1 || mydf$City[i] != mydf$City[i - 1] || mydf$ID[i] != mydf$ID[i - 1]) {
    mydf$Sequence[i] <- 1L
  } else {
    mydf$Sequence[i] <- mydf$Sequence[i - 1] + 1L
  }
}
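
The loop's restart condition can also be written without a loop. The following is a sketch, under the assumption that mydf has at least one row and no NA values in ID or City. The sample data and the helper names new_run and run_id are made up for illustration:

```r
# Hypothetical sample data, for illustration only
mydf <- data.frame(ID = c(1, 1, 1, 1, 2, 2),
                   City = c("Paris", "Paris", "Rome", "Paris", "Paris", "Paris"),
                   stringsAsFactors = FALSE)
n <- nrow(mydf)

# TRUE wherever a row starts a new run: the first row, or a change in City or ID
# compared with the previous row (the same test as the loop's if condition)
new_run <- c(TRUE, mydf$City[-1] != mydf$City[-n] | mydf$ID[-1] != mydf$ID[-n])

# Cumulative count of run starts gives every row the index of its run
run_id <- cumsum(new_run)

# Number the rows within each run
mydf$Sequence <- ave(seq_len(n), run_id, FUN = seq_along)
# mydf$Sequence is 1 2 1 1 1 2
```

This compares each row with the one before it, just as the loop does, but in whole-vector operations.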