Simple Table with dplyr on Sequence Data

房东的猫, submitted 2019-12-23 01:46:31

Question


I would like to make a simple table with dplyr and summarise, but I can't figure out how (even though it should be quite simple).

I have a matrix of sequences. When I simply tabulate it with table(dta), I get the result I want:

 dta
            acquaintance                        alone                        child                    notnotnot                      nuclear 
                       1                            2                           17                           19                          131 
 nuclear and     acquaintance  nuclear and    acquaintance    nuclear and  acquaintance     nuclear and acquaintance                      partner 
                       1                            1                            1                           35                            2 

However, I can't figure out how to do the same with summarise.

Any suggestions?

 dta =   structure(c("nuclear", "nuclear", "child", "child", "child", 
"acquaintance", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "nuclear", "child", "child", 
"child", "alone", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "child", "child", "child", 
"child", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "child", "child", "child", 
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "nuclear", "child", "child", 
 "nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear", 
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear", 
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear", 
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear", 
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear", 
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear", 
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear", 
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"partner", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear", 
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
 "partner", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear and acquaintance", 
 "nuclear", "nuclear", "nuclear and acquaintance", "nuclear and  acquaintance", 
 "notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear and     acquaintance", 
 "nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
 "notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear and    acquaintance", 
 "nuclear", "nuclear", "nuclear", "nuclear", "notnotnot", "nuclear", 
  "nuclear", "nuclear", "nuclear", "nuclear and acquaintance", 
 "nuclear", "nuclear", "nuclear", "nuclear", "notnotnot", "nuclear", 
"nuclear", "nuclear", "nuclear", "nuclear and acquaintance", 
"nuclear", "nuclear", "nuclear", "nuclear", "notnotnot", "nuclear", 
"nuclear", "nuclear", "nuclear", "nuclear and acquaintance", 
"nuclear", "nuclear", "nuclear", "nuclear", "notnotnot", "nuclear", 
"nuclear", "nuclear", "nuclear", "nuclear and acquaintance", 
"nuclear", "nuclear", "child", "nuclear", "notnotnot", "nuclear", 
"nuclear", "nuclear", "nuclear", "nuclear and acquaintance", 
"nuclear", "nuclear", "child", "alone", "notnotnot", "nuclear"
), .Dim = c(10L, 21L), .Dimnames = list(c("1", "2", "3", "4", 
"5", "6", "7", "8", "9", "10"), c("12:10", "12:20", "12:30", 
"12:40", "12:50", "13:00", "13:10", "13:20", "13:30", "13:40", 
"13:50", "14:00", "14:10", "14:20", "14:30", "14:40", "14:50", 
"15:00", "15:10", "15:20", "15:30")))

Answer 1:


You just have to convert your data to a data.frame to use dplyr and then you can easily get your desired output:

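library(dplyr)
# ungrouped: c(dta) flattens the matrix into one character vector
# (tibble() and group_by() replace the deprecated data_frame() and group_by_())
tibble(var = c(dta)) %>% 
  group_by(var) %>% 
  summarise(n())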
##                             var n()
## 1                  acquaintance   1
## 2                         alone   2
## 3                         child  17
## 4                     notnotnot  19
## 5                       nuclear 131
## 6  nuclear and     acquaintance   1
## 7   nuclear and    acquaintance   1
## 8     nuclear and  acquaintance   1
## 9      nuclear and acquaintance  35
## 10                      partner   2
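
As a cross-check, the same tabulation can be written with count(), which is dplyr's shorthand for group_by() followed by summarise(n = n()). The sketch below is only an illustration: the small matrix toy is a hypothetical stand-in for dta, not the data from the question, and the object names are assumptions.

```r
library(dplyr)

# Hypothetical toy matrix standing in for `dta` (3 sequences x 4 time slots)
toy <- matrix(
  c("nuclear", "child",   "nuclear",
    "nuclear", "child",   "alone",
    "nuclear", "nuclear", "alone",
    "partner", "nuclear", "nuclear"),
  nrow = 3,
  dimnames = list(c("1", "2", "3"), c("12:10", "12:20", "12:30", "12:40"))
)

# c() drops the dim attribute, leaving a plain character vector of all cells
counts <- tibble(var = c(toy)) %>%
  count(var, name = "n")  # shorthand for group_by(var) %>% summarise(n = n())

# Base R tabulation of the same matrix, for comparison
base_counts <- table(toy)

print(counts)
print(base_counts)
```

Both objects should report the same number of cells per state; count() returns them as a tibble with one row per state, whereas table() returns a named vector-like object.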

If you want to do this for each column separately, you can use tidyr to first gather the data into long format and then spread the counts back out, one column per time slot.

require(tidyr)
# grouped
dta %>% 
  as.data.frame %>% 
  gather %>% 
  group_by(key, value) %>% 
  summarise(N = n()) %>% 
  spread(key, N)
##                           value 12:10 12:20 12:30 12:40 12:50 13:00 13:10 13:20 13:30 13:40 13:50 14:00 14:10
## 1                  acquaintance     1    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
## 2                         alone    NA     1    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
## 3                         child     3     3     4     3     2    NA    NA    NA    NA    NA    NA    NA    NA
## 4                     notnotnot     1     1     1     1     1     1     1     1     1     1     1    NA    NA
## 5                       nuclear     3     3     3     4     5     7     7     7     7     7     7     7     7
## 6  nuclear and     acquaintance    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
## 7   nuclear and    acquaintance    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
## 8     nuclear and  acquaintance    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
## 9      nuclear and acquaintance     2     2     2     2     2     2     2     2     2     2     2     2     2
## 10                      partner    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA     1     1
## Variables not shown: 14:20 (int), 14:30 (int), 14:40 (int), 14:50 (int), 15:00 (int), 15:10 (int), 15:20 (int), 
## 15:30 (int)
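
In current tidyr, gather() and spread() are superseded by pivot_longer() and pivot_wider(). A minimal sketch of the same per-column tabulation with those functions follows; it again uses a hypothetical toy matrix rather than dta, and filling the empty cells with 0 instead of NA is my own choice, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy matrix standing in for `dta` (3 sequences x 4 time slots)
toy <- matrix(
  c("nuclear", "child",   "nuclear",
    "nuclear", "child",   "alone",
    "nuclear", "nuclear", "alone",
    "partner", "nuclear", "nuclear"),
  nrow = 3,
  dimnames = list(c("1", "2", "3"), c("12:10", "12:20", "12:30", "12:40"))
)

per_slot <- toy %>%
  as.data.frame(stringsAsFactors = FALSE) %>%
  # long format: one row per (time slot, state) cell
  pivot_longer(everything(), names_to = "key", values_to = "value") %>%
  # number of sequences in each state at each time slot
  count(key, value, name = "N") %>%
  # wide format: one column per time slot, 0 where a state never occurs
  pivot_wider(names_from = key, values_from = N,
              values_fill = list(N = 0L))

print(per_slot)
```

Dropping the values_fill argument leaves NA in the empty cells, which matches the output shown above.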


Source: https://stackoverflow.com/questions/30941737/simple-table-with-dplyr-on-sequence-data
