R count occurrences of an element by groups [duplicate]

Submitted by 时光总嘲笑我的痴心妄想 on 2020-12-11 10:08:27

Question


What is the easiest way to count the occurrences of an element in a vector or data.frame within each group?
I don't mean just counting the total (as other Stack Overflow questions ask) but giving a different number to every successive occurrence.

For example, for this simple data frame (though I will work with data frames with more columns):

mydata <- data.frame(A=c("A","A","A","B","B","A", "A"))

I've found this solution:

cbind(mydata,myorder=ave(rep(1,nrow(mydata)),mydata$A, FUN=cumsum))   

and here is the result:

 A myorder  
 A       1  
 A       2  
 A       3  
 B       1  
 B       2  
 A       4  
 A       5  

Isn't there a single command to do it, or a specialized package?

I want it to later use tidyr's spread() function.

My question is not the same as "Is there an aggregate FUN option to count occurrences?", because I don't want the total number of occurrences at the end but the cumulative count of occurrences up to each element.

OK, my problem is a little more complex:

mydata <- data.frame(group=c("x","x","x","x","y","y", "y"), letter=c("A","A","A","B","B","A", "A"))

I only know how to solve the first example I wrote above. But what happens when I also want it by a second grouping variable? Something like occurrences(letter) by group.

group letter  "occurrences within group"  
 x      A       1  
 x      A       2  
 x      A       3  
 x      B       1  
 y      B       1  
 y      A       1  
 y      A       2  

I've found a way with

ave(rep(1,nrow(mydata)),list(mydata$group, mydata$letter), FUN=cumsum)
though there should be something easier.


Answer 1:


Using data.table

library(data.table)
setDT(mydata)
mydata[, myorder := 1:.N, by = .(group, letter)]

The by argument makes data.table work within each combination of the group and letter columns. .N is the number of rows in the current group (if the by argument were empty, it would be the number of rows in the whole table), so within each sub-table the rows are indexed from 1 to the number of rows in that sub-table.

mydata
   group letter myorder
1:     x      A       1
2:     x      A       2
3:     x      A       3
4:     x      B       1
5:     y      B       1
6:     y      A       1
7:     y      A       2
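To make the role of `.N` more concrete, here is a minimal sketch on the question's toy data that compares `.N` with and without `by`. The names `dt`, `n_total` and `n_per_group` are only illustrative and do not appear in the original answer, and `seq_len(.N)` is used as an assumed equivalent of `1:.N`.

```r
library(data.table)

# Toy data copied from the question (dt is a hypothetical name)
dt <- data.table(
  group  = c("x", "x", "x", "x", "y", "y", "y"),
  letter = c("A", "A", "A", "B", "B", "A", "A")
)

# Without `by`, .N is the row count of the whole table
n_total <- dt[, .N]

# With `by`, .N is evaluated once per (group, letter) combination
n_per_group <- dt[, .N, by = .(group, letter)]

# seq_len(.N) yields the same running index as 1:.N for non-empty groups
dt[, myorder := seq_len(.N), by = .(group, letter)]
```

`seq_len(.N)` is a slightly more defensive spelling than `1:.N`, since `1:0` would count down to `c(1, 0)` if a group could ever be empty; with `by`, groups always contain at least one row, so both forms should give the same result here.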

Or a dplyr solution, which is much the same as the data.table one:

library(dplyr)  # provides %>%, group_by(), mutate() and n()

mydata %>% 
  group_by(group, letter) %>% 
  mutate(myorder = 1:n())
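For comparison, the same per-group running index can probably be obtained in base R without extra packages. This is only a sketch that slightly simplifies the asker's own `ave()` approach by replacing `rep(1, ...)` plus `cumsum` with `seq_along`; the helper name `idx` is hypothetical.

```r
# Toy data copied from the question
mydata <- data.frame(
  group  = c("x", "x", "x", "x", "y", "y", "y"),
  letter = c("A", "A", "A", "B", "B", "A", "A")
)

# One integer per row; ave() splits it by the grouping variables
idx <- seq_len(nrow(mydata))

# Within each (group, letter) combination, number the rows 1, 2, 3, ...
mydata$myorder <- ave(idx, mydata$group, mydata$letter, FUN = seq_along)
```

Printing `mydata` afterwards should show the same `myorder` column as the data.table output above.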


Source: https://stackoverflow.com/questions/32586674/r-count-occurrences-of-an-element-by-groups
