Question
I am trying to calculate the percentage of each level of a factor within a group.
I have nested data and would like to see what percentage of schools in each country are private schools (a factor with 2 levels).
However, I cannot figure out how to do that.
# my data:
CNT <- c("A", "A", "A", "A", "A", "B", "B", "B", "C", "C", "C", "C", "C", "C", "D", "D",
         "D", "D", "D", "D")
SCHOOL <- c(1:5, 1:3, 1:6, 1:6)
FACTOR <- as.factor(c(1,2,1,2,1,1,1,2,1,2,2,2,1,1,1,1,1,1,1,1))
mydata <- data.frame(CNT, SCHOOL, FACTOR)
head(mydata)
I want a column with the percentage of one level of FACTOR (let's say 1) within each country.
Answer 1:
Another solution (with base R), where margin = 1 gives proportions within each row, i.e. within each country:
prop.table(table(mydata$CNT, mydata$FACTOR), margin = 1)
            1         2
  A 0.6000000 0.4000000
  B 0.6666667 0.3333333
  C 0.5000000 0.5000000
  D 1.0000000 0.0000000
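The table above holds proportions for both levels, while the question asks for a percentage column for a single level. A minimal base R sketch of that last step follows, assuming level 1 is the level of interest; the names `props`, `pct_level1` and the new column are hypothetical and not part of the original answer.

```r
# Rebuild the example data from the question
CNT <- rep(c("A", "B", "C", "D"), times = c(5, 3, 6, 6))
SCHOOL <- c(1:5, 1:3, 1:6, 1:6)
FACTOR <- as.factor(c(1,2,1,2,1,1,1,2,1,2,2,2,1,1,1,1,1,1,1,1))
mydata <- data.frame(CNT, SCHOOL, FACTOR)

# Row-wise proportions: each country's row sums to 1
props <- prop.table(table(mydata$CNT, mydata$FACTOR), margin = 1)

# Keep only the column for level "1" and convert it to a percentage
pct_level1 <- 100 * props[, "1"]

# Look up each row's country in the named vector to get a per-row column
mydata$pct_level1 <- unname(pct_level1[as.character(mydata$CNT)])
head(mydata)
```

Indexing `pct_level1` by country name repeats each country's value for every school in that country, so the result has one value per row of `mydata`.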
Answer 2:
Just group your data by CNT and then summarise the groups to calculate how many instances of FACTOR == 1 you have versus the total number of observations within that group (n()).
library(dplyr)
mydata %>%
  group_by(CNT) %>%
  summarise(
    priv_perc = sum(FACTOR == 1, na.rm = TRUE) / n()
  )
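summarise() returns one row per country. If the goal is a column on the original data, as the question states, one possible variation is to use mutate() instead. This is a sketch rather than the answerer's code: it assumes dplyr is installed, reuses the `priv_perc` name from above, introduces a hypothetical `with_pct` object, and scales the result to 0-100.

```r
library(dplyr)

# Example data from the question
mydata <- data.frame(
  CNT = rep(c("A", "B", "C", "D"), times = c(5, 3, 6, 6)),
  SCHOOL = c(1:5, 1:3, 1:6, 1:6),
  FACTOR = as.factor(c(1,2,1,2,1,1,1,2,1,2,2,2,1,1,1,1,1,1,1,1))
)

# mutate() keeps every row and repeats the group-level value within each country
with_pct <- mydata %>%
  group_by(CNT) %>%
  mutate(priv_perc = 100 * mean(FACTOR == 1, na.rm = TRUE)) %>%
  ungroup()

head(with_pct)
```

Note that mean(..., na.rm = TRUE) drops missing values from the denominator, whereas sum(...) / n() counts them in the group total. The two agree here because the example data has no missing values.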
Source: https://stackoverflow.com/questions/62953231/percentage-of-factor-levels-by-group-in-r