How to aggregate count of unique values of categorical variables in R

Submitted by 女生的网名这么多〃 on 2019-12-06 05:41:23

Question


Suppose I have a data set data:

x1 <- c("a","a","a","a","a","a","b","b","b","b")
x2 <- c("a1","a1","a2","a1","a2","a3","b1","b1","b2","b2")
data <- data.frame(x1,x2)

x1 x2
a  a1
a  a1 
a  a2
a  a1
a  a2
a  a3
b  b1
b  b1
b  b2 
b  b2

I want to find the number of unique values of x2 for each value of x1.

For example, a has 3 unique values (a1, a2 and a3) and b has 2 (b1 and b2).

I tried aggregate(x1~.,data,sum), but it did not work because these are factors, not integers.

Please help


Answer 1:


Try

 aggregate(x2~x1, data, FUN=function(x) length(unique(x)))
 #  x1 x2
 #1  a  3
 #2  b  2
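
As a self-contained sketch of the aggregate() approach above, the data below is reconstructed from the table in the question, and the column name n_unique is a hypothetical choice for illustration only:

```r
# Data reconstructed from the table shown in the question
x1 <- c("a","a","a","a","a","a","b","b","b","b")
x2 <- c("a1","a1","a2","a1","a2","a3","b1","b1","b2","b2")
data <- data.frame(x1, x2)

# For each x1 group, count how many distinct x2 values occur
counts <- aggregate(x2 ~ x1, data, FUN = function(x) length(unique(x)))

# Rename the result column (hypothetical name, for readability only)
names(counts)[2] <- "n_unique"
print(counts)
```

The anonymous function is applied to the x2 values of each x1 group, so it works for factors and character vectors alike, unlike sum.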

Or

 rowSums(table(unique(data)))
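
To see why this one-liner gives the counts, it may help to run it one step at a time. This is only an illustrative breakdown on data reconstructed from the question's table; the intermediate names distinct_pairs, tab and n_unique are hypothetical:

```r
# Data reconstructed from the table shown in the question
x1 <- c("a","a","a","a","a","a","b","b","b","b")
x2 <- c("a1","a1","a2","a1","a2","a3","b1","b1","b2","b2")
data <- data.frame(x1, x2)

# Step 1: keep one row per distinct (x1, x2) combination
distinct_pairs <- unique(data)

# Step 2: cross-tabulate x1 against x2; since duplicates were removed,
# every cell is either 0 or 1
tab <- table(distinct_pairs)

# Step 3: summing each row counts the distinct x2 values per x1
n_unique <- rowSums(tab)
print(n_unique)
```

Because the duplicates are removed first, each row sum is simply the number of x2 levels that co-occur with that x1 value.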

Or

library(dplyr)
data %>% 
     group_by(x1) %>%
     summarise(n=n_distinct(x2))

Or another option using dplyr suggested by @Eric

count(distinct(data), x1)

Or

library(data.table)
setDT(data)[, uniqueN(x2) , x1]

Update

If you need both the unique values of 'x2' and the count

setDT(data)[, list(n=uniqueN(x2), x2=unique(x2)) , x1]

Or only the unique values

setDT(data)[, list(x2=unique(x2)) , x1]

Or using dplyr

 unique(data) %>% 
                   group_by(x1) %>%
                   mutate(n=n_distinct(x2))

Or only the unique values

unique(data)


Source: https://stackoverflow.com/questions/29001141/how-to-aggregate-count-of-unique-values-of-categorical-variables-in-r
