Split a data frame column containing a list into multiple columns using dplyr (or otherwise)

面向向阳花 2020-12-18 00:43

Consider the following example data, where each group's summary() result is stored in a single list column:

library(dplyr)
tmp <- mtcars %>%
    group_by(cyl) %>%
    summarise(mpg_sum = list(summary(mpg)))
4 Answers
  • 2020-12-18 01:18

    As mentioned in the comments, you can also use the tidy() function from the broom package:

    library(broom)
    mtcars %>% group_by(cyl) %>% do(tidy(summary(.$mpg)))
    # Source: local data frame [3 x 7]
    # Groups: cyl [3]
    # 
    #     cyl minimum    q1 median  mean    q3 maximum
    #   (dbl)   (dbl) (dbl)  (dbl) (dbl) (dbl)   (dbl)
    # 1     4    21.4 22.80   26.0 26.66 30.40    33.9
    # 2     6    17.8 18.65   19.7 19.74 21.00    21.4
    # 3     8    10.4 14.40   15.2 15.10 16.25    19.2
    
  • 2020-12-18 01:23

    An "or otherwise" option in base R, using split() and sapply():

    t(sapply(split(mtcars$mpg, mtcars$cyl), summary))
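    The call above returns a matrix whose row names hold the cyl values. If a data frame with an explicit cyl column is wanted, a minimal sketch could look like the following (the names res and out are only illustrative):

    ```r
    # Summarise mpg within each cyl group; sapply() returns one column per
    # group, so t() flips the result to one row per group.
    res <- t(sapply(split(mtcars$mpg, mtcars$cyl), summary))

    # The group labels are stored as row names; move them into a regular column.
    # check.names = FALSE keeps column names such as "1st Qu." unchanged.
    out <- data.frame(cyl = as.numeric(rownames(res)), res,
                      check.names = FALSE, row.names = NULL)
    ```

    Here out has one row per cyl value and one column per summary statistic.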
    
  • 2020-12-18 01:25

    We can use data.table. Convert the 'data.frame' to a 'data.table' with as.data.table(mtcars); then, grouped by 'cyl', take the summary of 'mpg' and convert it to a list so that each statistic becomes its own column.

    library(data.table)
    as.data.table(mtcars)[, as.list(summary(mpg)), by = cyl]
    #    cyl Min. 1st Qu. Median  Mean 3rd Qu. Max.
    #1:   6 17.8   18.65   19.7 19.74   21.00 21.4
    #2:   4 21.4   22.80   26.0 26.66   30.40 33.9
    #3:   8 10.4   14.40   15.2 15.10   16.25 19.2
    

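    Or, using only dplyr: after grouping by 'cyl', use do() to build a one-row data.frame per group from the same as.list(summary(...)) result. check.names = FALSE keeps column names such as '1st Qu.' intact.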

    library(dplyr)
    mtcars %>%
         group_by(cyl) %>%
         do(data.frame(as.list(summary(.$mpg)), check.names=FALSE) )
    #   cyl  Min. 1st Qu. Median  Mean 3rd Qu.  Max.
    #  <dbl> <dbl>   <dbl>  <dbl> <dbl>   <dbl> <dbl>
    #1     4  21.4   22.80   26.0 26.66   30.40  33.9
    #2     6  17.8   18.65   19.7 19.74   21.00  21.4
    #3     8  10.4   14.40   15.2 15.10   16.25  19.2
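    As a hedged alternative to do(), and assuming dplyr 1.0.0 or later is installed, summarise() can itself return several columns when the expression is an unnamed data frame. This is only a sketch of that approach, not part of the original answer, and the name res is illustrative:

    ```r
    library(dplyr)

    # Assumes dplyr >= 1.0.0, where an unnamed data frame returned inside
    # summarise() is unpacked into one column per variable.
    res <- mtcars %>%
        group_by(cyl) %>%
        summarise(data.frame(as.list(summary(mpg)), check.names = FALSE))
    ```

    The result should have one row per cyl group, with the grouping column followed by the six summary statistics.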
    

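    Or using purrr. Note that slice_rows(), by_slice() and dmap() were moved out of purrr into the purrrlyr package (as of purrr 0.2.3), so on a current installation load purrrlyr instead:

    library(dplyr)     # for %>% and select()
    library(purrrlyr)  # library(purrr) on versions before 0.2.3
    mtcars %>% 
         slice_rows("cyl") %>% 
         select(mpg) %>%
         by_slice(dmap, summary, .collate= "cols")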
    
  • 2020-12-18 01:38

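    Another option, using by() from base R. Note that this returns an object of class "by" (printed one group at a time) rather than a data frame:

    with(data = mtcars, by(mpg, cyl, FUN = summary))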
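    Because by() returns a list-like object with one summary per group, the pieces can be stacked into a matrix with one row per cyl value. A minimal sketch, with res and mat as illustrative names:

    ```r
    # One summary() result per cyl group, stored in an object of class "by".
    res <- with(mtcars, by(mpg, cyl, FUN = summary))

    # Stack the per-group summaries row-wise; the group labels become row names
    # and the statistic names become column names.
    mat <- do.call(rbind, res)
    ```

    From here, as.data.frame(mat) gives a data frame if one is needed.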
    