Crosstab with multiple items

天命终不由人 2021-02-01 10:48

In SPSS, it is (relatively) easy to create a cross tab with multiple variables using the factors (or values) as the table heading. So, something like the following (made up dat

7 answers
  • 2021-02-01 10:56

    xtabs has a formula interface that can take some practice to get used to, but this can be done. If you have the data in a dataframe df and your variables are called ques and resp, you can use:

    xtabs(~ques+resp,data=df)
    

    For example:

    > t1 <- rep(c("A","B","C"),5)
    > t2 <- rpois(15,4)
    > df <- data.frame(ques=t1,resp=t2)
    > xtabs(~ques+resp,data=df)
        resp
    ques 2 3 4 5 6 7 9
       A 1 0 2 1 0 0 1
       B 1 0 0 2 1 1 0
       C 1 2 0 1 0 1 0
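
    If the answers sit in one column per question rather than in ques/resp columns, they have to be stacked before the formula above applies. A minimal sketch using base R's stack(); the data frame dd and its values are made up for illustration:

    ```r
    # Hypothetical wide data: one column per question, one row per respondent
    dd <- data.frame(Q1 = c(1, 2, 2, 3, 1),
                     Q2 = c(3, 3, 2, 1, 1),
                     Q3 = c(2, 2, 2, 2, 3))

    # stack() returns two columns: 'values' (the answers) and 'ind' (the source column)
    long <- stack(dd)
    names(long) <- c("resp", "ques")

    # One row per question, one column per response value
    tab <- xtabs(~ ques + resp, data = long)
    print(tab)
    ```

    Note that stack() only picks up plain vector columns, so factor columns would need as.character() first.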
    
  • 2021-02-01 10:56

    You could use a custom function to use rbind() on several tables, something like this:

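    multitab <- function(...) {
      tabs <- list(...)                     # one vector per question
      tablist <- lapply(tabs, table)        # frequency table for each vector
      # One row per vector; assumes every vector contains the same set of values
      bigtab <- t(sapply(tablist, rbind))
      bigtab
    }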
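
    A variation on the same idea, offered as a sketch rather than a drop-in replacement: binding the tables with do.call(rbind, ...) keeps the response values as column names, and fixing the factor levels up front guards against a question in which some value never occurs. The name multitab2, its levels argument, and the sample vectors are illustrative assumptions.

    ```r
    # Hypothetical variant: 'levels' lists every response value that may occur
    multitab2 <- function(..., levels) {
      vars <- list(...)
      # Tabulate each vector against the same levels so all rows line up
      tabs <- lapply(vars, function(x) table(factor(x, levels = levels)))
      # Row names come from the argument names, column names from the levels
      do.call(rbind, tabs)
    }

    q1 <- c(1, 2, 2, 3)
    q2 <- c(1, 1, 1, 2)   # nobody answered 3 here
    res <- multitab2(Q1 = q1, Q2 = q2, levels = 1:3)
    print(res)
    ```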
    
  • 2021-02-01 10:59

    Modifying a previous example

    library(Hmisc)
    library(plyr)
    dd <- data.frame(q1 = sample(1:3, 20, replace = TRUE),
                     q2 = sample(1:3, 20, replace = TRUE),
                     q3 = sample(1:3, 20, replace = TRUE))  # fake data
    
    cross <- ldply(describe(dd), function(x) x$values[1,])[-1]
    
    rownames(cross) <- c("Q1. Likes it","Q2. Recommends it","Q3. Used it")
    names(cross) <- c("1 (very Often)","2 (Rarely)","3 (Never)")
    

    Now cross looks like this

    > cross
                      1 (very Often) 2 (Rarely) 3 (Never)
    Q1. Likes it                   4         10         6
    Q2. Recommends it              7          9         4
    Q3. Used it                    6          4        10
    
  • 2021-02-01 11:10

    Take a look at Hadley Wickham's reshape package. As far as I can see, you need the cast function from that package.

  • 2021-02-01 11:13

    The underlying issue is that this data is not in tidy format. Crosstabbing multiple variables will be easier when the data is reshaped into "long" form. We can do that with gather from the tidyr package.

    After reshaping, many crosstab functions will work; I'll use tabyl from the janitor package (since - full disclosure - I maintain that package and built the function for this purpose).

    # Create reproducible sample data
    set.seed(1)
    possible_values <- c("1 (Very Often)", "2 (Rarely)", "3 (Never)")
    some_values <- sample(possible_values, 100, replace = TRUE)
    dat <- data.frame(Q1 = some_values[1:25], Q2 = some_values[26:50], 
                     Q3 = some_values[51:75], Q4 = some_values[76:100])
    
    library(tidyr)
    library(janitor)
    
    dat %>%
      gather(question, response) %>% 
      tabyl(question, response)
    #>   question 1 (Very Often) 2 (Rarely) 3 (Never)
    #> 1       Q1              8          8         9
    #> 2       Q2              4         11        10
    #> 3       Q3              8         12         5
    #> 4       Q4              7          7        11
    

    From there, you can format with functions like janitor::adorn_percentages().
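
    As a rough sketch of that formatting step, the pipeline below turns the counts into row percentages with the counts in parentheses. It uses pivot_longer(), which tidyr now recommends in place of gather(); the small dat data frame is made up for illustration, and the sketch assumes tidyr and janitor are installed.

    ```r
    library(tidyr)
    library(janitor)

    # Hypothetical sample data: one column per question
    dat <- data.frame(
      Q1 = c("1 (Very Often)", "2 (Rarely)", "2 (Rarely)", "3 (Never)"),
      Q2 = c("1 (Very Often)", "1 (Very Often)", "3 (Never)", "3 (Never)")
    )

    # Reshape to long form, then count responses per question
    counts <- dat %>%
      pivot_longer(cols = everything(),
                   names_to = "question", values_to = "response") %>%
      tabyl(question, response)

    # Row percentages, rounded, with the underlying counts appended
    counts %>%
      adorn_percentages("row") %>%
      adorn_pct_formatting(digits = 0) %>%
      adorn_ns()
    ```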

  • 2021-02-01 11:17

    The Hmisc package has the summary.formula function, which can do something along the lines of what you want. It is very flexible, so look at the help page for examples, but here is an application to your problem:

    library(Hmisc)
    dd <- data.frame(Q1 = sample(1:3, 20, replace = TRUE),
                     Q2 = sample(1:3, 20, replace = TRUE),
                     Q3 = sample(1:3, 20, replace = TRUE))  # fake data
    summary(~Q1+Q2+Q3, data=dd, fun=table)
    

    This gives the following result:

     Descriptive Statistics  (N=20)
    
     +------+-------+
     |      |       |
     +------+-------+
     |Q1 : 1|25% (5)|
     +------+-------+
     |    2 |45% (9)|
     +------+-------+
     |    3 |30% (6)|
     +------+-------+
     |Q2 : 1|30% (6)|
     +------+-------+
     |    2 |35% (7)|
     +------+-------+
     |    3 |35% (7)|
     +------+-------+
     |Q3 : 1|35% (7)|
     +------+-------+
     |    2 |30% (6)|
     +------+-------+
     |    3 |35% (7)|
     +------+-------+
    

    The possible values are given in rows, because it has the flexibility of different sets of values for different variables. You might be able to play with the function parameters (like method and fun) to get the other direction.
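
    If the other direction is all that is needed, one possible workaround is to build it in base R instead of through Hmisc. This is only a sketch under the assumption that every question shares the same set of values; dd, vals, and cells are illustrative names, and the "percent (count)" cell format just imitates the output above.

    ```r
    # Hypothetical data: one column per question, values coded 1 to 3
    dd <- data.frame(Q1 = c(1, 2, 2, 3),
                     Q2 = c(1, 1, 2, 3),
                     Q3 = c(3, 3, 3, 1))
    vals <- 1:3

    # Count each value per question; transpose so questions become rows
    counts <- t(sapply(dd, function(x) table(factor(x, levels = vals))))

    # Row percentages, then "pct% (n)" cells in the same layout
    pct <- prop.table(counts, margin = 1) * 100
    cells <- matrix(sprintf("%.0f%% (%d)", pct, counts),
                    nrow = nrow(counts), dimnames = dimnames(counts))
    print(cells, quote = FALSE)
    ```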
