Question
I have a data set full of factors and dummy variables, and I want to see the proportion of each value within each group after dplyr::group_by(cyl).
mtcars; rownames(mtcars) <- NULL
df <- mtcars[,c(2,8,9)]
head(df)
  cyl vs am
1   6  0  1
2   6  0  1
3   4  1  1
4   6  1  0
5   8  0  0
6   6  1  0
Expected answer
In the rows above, cyl is 6 four times. For the vs column, two of those rows are 1 and two are 0, so I expect:
     1    0
6   50%  50%
4  100%   0%
8    0% 100%
The same for column am.
Answer 1:
Here's a first crack:
library(dplyr)
library(tidyr)
(df
%>% pivot_longer(-cyl) ## spread out variables (vs, am)
%>% group_by(cyl,name)
%>% mutate(n=n()) ## obs per cyl/var combo
%>% group_by(cyl,name,value)
%>% summarise(prop=n()/n) ## proportion of 0/1 per cyl/var
%>% unique() ## n is a per-row column, so summarise() returns one row per observation; drop the duplicates
%>% pivot_wider(id_cols=c(cyl,name),names_from=value,values_from=prop)
)
Results:
    cyl name     `0`   `1`
  <dbl> <chr>  <dbl> <dbl>
1     4 am    0.273  0.727
2     4 vs    0.0909 0.909
3     6 am    0.571  0.429
...
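A possible variant of the pipeline above, offered as a sketch rather than a definitive fix: counting each cyl/variable/value combination first with count() gives one row per combination, so the unique() step should no longer be needed. The names props and props_pct are illustrative, and values_fill = 0 is an assumption about how to show combinations that never occur (for example vs == 1 when cyl == 8).

```r
library(dplyr)
library(tidyr)

df <- mtcars[, c("cyl", "vs", "am")]

props <- df %>%
  pivot_longer(-cyl) %>%             # one row per observation and variable (vs, am)
  count(cyl, name, value) %>%        # observations per cyl/variable/value combo
  group_by(cyl, name) %>%
  mutate(prop = n / sum(n)) %>%      # share of each value within its cyl/variable combo
  ungroup() %>%
  select(-n) %>%
  pivot_wider(names_from = value, values_from = prop,
              values_fill = 0)       # assumed: absent combinations shown as 0

# Optional: format the proportions as percentage strings, as in the question
props_pct <- props %>%
  mutate(across(c(`0`, `1`), ~ sprintf("%.0f%%", 100 * .x)))
```

Because the proportions are computed from one row per combination, each cyl/name pair appears exactly once in props and the `0` and `1` columns sum to 1 in every row.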
Source: https://stackoverflow.com/questions/64783023/proportion-of-factors-and-dummies