Suppose I have a data set data:
x1 <- c("a","a","a","a","a","a","b","b","b","b")
x2 <- c("a1","a1","a2","a1","a2","a3","b1","b1","b2","b2")
data <- data.frame(x1,x2)
x1 x2
a a1
a a1
a a2
a a1
a a2
a a3
b b1
b b1
b b2
b b2
I want to find the number of unique values of x2 for each value of x1. For example, a has 3 unique values (a1, a2 and a3) and b has 2 unique values (b1 and b2).
I used aggregate(x1~.,data,sum), but it did not work since these are factors, not integers.
Please help.
Try
aggregate(x2~x1, data, FUN=function(x) length(unique(x)))
# x1 x2
#1 a 3
#2 b 2
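The same per-group idea can be sketched in base R with tapply. This is only an illustrative alternative, not part of the original answer; the data is rebuilt here to match the table printed in the question, and counts is a name chosen for this example.

```r
# Rebuild the example data so that it matches the table in the question
x1 <- c("a","a","a","a","a","a","b","b","b","b")
x2 <- c("a1","a1","a2","a1","a2","a3","b1","b1","b2","b2")
data <- data.frame(x1, x2)

# For each value of x1, count the distinct x2 values in that group
counts <- tapply(data$x2, data$x1, function(x) length(unique(x)))
counts
```

Unlike aggregate, this returns a named array rather than a data frame, so it is mostly handy for a quick check.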
Or
rowSums(table(unique(data)))
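To see why that one-liner works, it may help to split it into its three steps. The sketch below rebuilds the data to match the table in the question; the intermediate names pairs, tab and counts are introduced here only for illustration.

```r
# Rebuild the example data so that it matches the table in the question
x1 <- c("a","a","a","a","a","a","b","b","b","b")
x2 <- c("a1","a1","a2","a1","a2","a3","b1","b1","b2","b2")
data <- data.frame(x1, x2)

# Step 1: drop duplicated rows, leaving one row per distinct (x1, x2) pair
pairs <- unique(data)

# Step 2: cross-tabulate x1 against x2; a cell is 1 if the pair exists, else 0
tab <- table(pairs)

# Step 3: sum each row to get the number of distinct x2 values per x1
counts <- rowSums(tab)
counts
```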
Or
library(dplyr)
data %>%
group_by(x1) %>%
summarise(n=n_distinct(x2))
Or another option using dplyr, suggested by @Eric:
count(distinct(data), x1)
Or
library(data.table)
setDT(data)[, uniqueN(x2) , x1]
Update
If you need both the unique values of 'x2' and the count:
setDT(data)[, list(n=uniqueN(x2), x2=unique(x2)) , x1]
Or only the unique values:
setDT(data)[, list(x2=unique(x2)) , x1]
Or using dplyr
unique(data) %>%
group_by(x1) %>%
mutate(n=n_distinct(x2))
Or only the unique values:
unique(data)
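If neither dplyr nor data.table is available, the "unique values plus count" result can be approximated in base R. This is a minimal sketch under that assumption, not something from the original answer; pairs and n_per_group are names made up for the example.

```r
# Rebuild the example data so that it matches the table in the question
x1 <- c("a","a","a","a","a","a","b","b","b","b")
x2 <- c("a1","a1","a2","a1","a2","a3","b1","b1","b2","b2")
data <- data.frame(x1, x2)

# One row per distinct (x1, x2) pair
pairs <- unique(data)

# After deduplication, the number of rows per x1 is the number of distinct x2 values
n_per_group <- table(pairs$x1)

# Attach the group count to every row by looking it up by name
pairs$n <- as.integer(n_per_group[as.character(pairs$x1)])
pairs
```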
Source: https://stackoverflow.com/questions/29001141/how-to-aggregate-count-of-unique-values-of-categorical-variables-in-r