Split data.frame based on levels of a factor into new data.frames

一个人的身影 2020-11-21 16:44

I'm trying to create separate data.frame objects based on the levels of a factor. So if I have:

df <- data.frame(
  x=rnorm(25),
  y=rnorm(25),
         


        
2 answers
  •  無奈伤痛
    2020-11-21 17:41

    Since dplyr 0.8.0, we can also use group_split, which behaves much like base::split:

    library(dplyr)
    df %>% group_split(g)
    
    #[[1]]
    # A tibble: 5 x 3
    #       x      y g    
    #   <dbl>  <dbl> <fct>
    #1 -1.21  -1.45  A    
    #2  0.506  1.10  A    
    #3 -0.477 -1.17  A    
    #4 -0.110  1.45  A    
    #5  0.134 -0.969 A    
    
    #[[2]]
    # A tibble: 5 x 3
    #       x      y g    
    #   <dbl>  <dbl> <fct>
    #1  0.277  0.575 B    
    #2 -0.575 -0.476 B    
    #3 -0.998 -2.18  B    
    #4 -0.511 -1.07  B    
    #5 -0.491 -1.11  B  
    #....
    

    It also takes a .keep argument (TRUE by default) that controls whether the grouping column is kept in each piece.

    df %>% group_split(g, .keep = FALSE)
    
    #[[1]]
    # A tibble: 5 x 2
    #       x      y
    #   <dbl>  <dbl>
    #1 -1.21  -1.45 
    #2  0.506  1.10 
    #3 -0.477 -1.17 
    #4 -0.110  1.45 
    #5  0.134 -0.969
    
    #[[2]]
    # A tibble: 5 x 2
    #       x      y
    #   <dbl>  <dbl>
    #1  0.277  0.575
    #2 -0.575 -0.476
    #3 -0.998 -2.18 
    #4 -0.511 -1.07 
    #5 -0.491 -1.11 
    #....
    

    The difference between base::split and dplyr::group_split is that group_split does not name the list elements after the groups. So

    df1 <- df %>% group_split(g)
    names(df1) #gives 
    #NULL
    

    whereas

    df2 <- split(df, df$g)
    names(df2) #gives
    #[1] "A" "B" "C" "D" "E"
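
    If the names are needed with group_split, one possible workaround is to take them from group_keys, which returns one row per group in the same order as the pieces. This is only a sketch: the object names grouped and keys are made up for illustration, and it assumes the same df as in the data section.

    ```r
    library(dplyr)

    set.seed(1234)
    df <- data.frame(
      x = rnorm(25),
      y = rnorm(25),
      g = rep(factor(LETTERS[1:5]), 5)
    )

    # group once, then reuse the grouped tibble for both keys and pieces
    grouped <- df %>% group_by(g)

    # one row per group, in the same order as the output of group_split()
    keys <- grouped %>% group_keys()

    # split into a list of tibbles and attach the group labels as names
    df1 <- grouped %>% group_split() %>% setNames(as.character(keys$g))

    names(df1)
    #[1] "A" "B" "C" "D" "E"
    ```

    After this, df1[["A"]] can be used the same way as df2[["A"]] from split, except that the pieces are tibbles rather than plain data.frames.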
    

    data

    set.seed(1234)
    df <- data.frame(
          x=rnorm(25),
          y=rnorm(25),
          g=rep(factor(LETTERS[1:5]), 5)
    )
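
    Since the question asks for separate data.frame objects, the named list from split can also be copied into individual objects with base R's list2env. This is a minimal sketch rather than a recommendation: the df_ prefix and the env environment are assumptions chosen for illustration, and keeping the pieces in the list is often easier to work with.

    ```r
    set.seed(1234)
    df <- data.frame(
      x = rnorm(25),
      y = rnorm(25),
      g = rep(factor(LETTERS[1:5]), 5)
    )

    # named list with one data.frame per level of g
    pieces <- split(df, df$g)

    # a single piece can be accessed by its level name
    head(pieces$A)

    # hypothetical naming scheme for the new objects: df_A, df_B, ...
    names(pieces) <- paste0("df_", names(pieces))

    # copy each piece into its own object; a dedicated environment avoids
    # overwriting variables that already exist in the workspace
    env <- new.env()
    list2env(pieces, envir = env)

    ls(env)
    #[1] "df_A" "df_B" "df_C" "df_D" "df_E"
    ```

    Passing envir = .GlobalEnv instead of env would create the objects directly in the workspace.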
    
