How to group a vector into a list of vectors?

借酒劲吻你 2020-12-06 19:13

I have some data which looks like this (fake data for example's sake):

dressId        color 
6              yellow 
9              red
10             green         


        
4 answers
  • 2020-12-06 19:29

    split.data.frame is a good way to organize this; then extract the color component.

    d <- data.frame(dressId=c(6,9,10,10,10,12,12),
                   color=factor(c("yellow","red","green",
                                  "purple","yellow",
                                  "purple","red"),
                     levels=c("red","orange","yellow",
                              "green","blue","purple")))
    

    I think the version you want is actually this:

    ss <- split.data.frame(d,d$dressId)
    

    You can get something more like the list you requested by extracting the color component:

    lapply(ss,"[[","color")
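
    If only the color vectors are needed, a shorter base-R route is to split the color column itself instead of the whole data frame. This is a minimal sketch that rebuilds the sample data d from above; by_dress is just an illustrative name.

    ```r
    # Sample data, same as d above
    d <- data.frame(dressId = c(6, 9, 10, 10, 10, 12, 12),
                    color = factor(c("yellow", "red", "green",
                                     "purple", "yellow",
                                     "purple", "red"),
                                   levels = c("red", "orange", "yellow",
                                              "green", "blue", "purple")))

    # Split the color vector by dress: a named list with one factor per dressId
    by_dress <- split(d$color, d$dressId)

    # Element names are the dressId values converted to strings
    by_dress[["10"]]
    ```

    Each element keeps all six factor levels; wrap it in as.character() if plain strings are preferred.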
    
  • 2020-12-06 19:38

    I think the answer should be a little different. To collapse each dress's distinct colors into a single comma-separated string, use the following dplyr code (df is assumed to hold your data):

    library(dplyr)

    df %>%
      group_by(dressId) %>%
      summarize(colors = toString(unique(color)))
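
    For comparison, the same collapsing step can be sketched in base R without dplyr. This swaps in split and vapply for group_by and summarize; the data frame df and the name colors are assumptions made for illustration.

    ```r
    # Hypothetical df mirroring the question's data, with color as plain strings
    df <- data.frame(dressId = c(6, 9, 10, 10, 10, 12, 12),
                     color = c("yellow", "red", "green", "purple",
                               "yellow", "purple", "red"),
                     stringsAsFactors = FALSE)

    # Group the colors by dress, then paste each group's distinct values
    # into one comma-separated string
    colors <- vapply(split(df$color, df$dressId),
                     function(x) toString(unique(x)),
                     character(1))
    colors
    ```

    Note that this gives one string per dress, not a vector of colors, so the individual colors can no longer be indexed directly.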
    
  • 2020-12-06 19:46

    Assuming your data frame is saved in a variable called df, you can simply use group_by and summarize from the dplyr package, with list as the summary function:

    library('dplyr')
    
    df %>%
      group_by(dressId) %>%
      summarize(colors = list(color))
    

    Applied to your example:

    df <- tribble(
      ~dressId, ~color,
             6, 'yellow',
             9, 'red',
            10, 'green',
            10, 'purple',
            10, 'yellow',
            12, 'purple',
            12, 'red'
    )
    
    df %>%
      group_by(dressId) %>%
      summarize(colors = list(color))
    
    # dressId                colors
    #       6                yellow
    #       9                   red
    #      10 green, purple, yellow
    #      12           purple, red
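
    If dplyr is not available, a comparable list column can be built in base R. The sketch below is an alternative technique, not the dplyr implementation; res and by_dress are illustrative names.

    ```r
    # Same example data as a plain data frame
    df <- data.frame(dressId = c(6, 9, 10, 10, 10, 12, 12),
                     color = c("yellow", "red", "green", "purple",
                               "yellow", "purple", "red"),
                     stringsAsFactors = FALSE)

    # One character vector of colors per dress
    by_dress <- split(df$color, df$dressId)

    # One row per dress; assigning a list with one element per row
    # creates a list column
    res <- data.frame(dressId = as.numeric(names(by_dress)))
    res$colors <- unname(by_dress)

    # Individual cells of the list column are extracted with [[ ]]
    res$colors[[3]]
    ```

    Because each cell holds a full vector, functions such as lengths(res$colors) can be used to count the colors per dress.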
    
  • 2020-12-06 19:49

    In addition to split, you should consider aggregate. Use c or I as the aggregation function to get your list column:

    out <- aggregate(color ~ dressId, mydf, c)
    out
    #   dressId                 color
    # 1       6                yellow
    # 2       9                   red
    # 3      10 green, purple, yellow
    # 4      12           purple, red
    str(out)
    # 'data.frame': 4 obs. of  2 variables:
    #  $ dressId: int  6 9 10 12
    #  $ color  :List of 4
    #   ..$ 0: chr "yellow"
    #   ..$ 1: chr "red"
    #   ..$ 2: chr  "green" "purple" "yellow"
    #   ..$ 3: chr  "purple" "red"
    out$color
    # $`0`
    # [1] "yellow"
    # 
    # $`1`
    # [1] "red"
    # 
    # $`2`
    # [1] "green"  "purple" "yellow"
    # 
    # $`3`
    # [1] "purple" "red" 
    

    Note: This works even if the "color" variable is a factor, as in Ben's sample data (I missed that point when I posted the answer above), but you need to use I as the aggregation function instead of c:

    out <- aggregate(color ~ dressId, d, I)
    str(out)
    # 'data.frame': 4 obs. of  2 variables:
    #  $ dressId: num  6 9 10 12
    #  $ color  :List of 4
    #   ..$ 0: Factor w/ 6 levels "red","orange",..: 3
    #   ..$ 1: Factor w/ 6 levels "red","orange",..: 1
    #   ..$ 2: Factor w/ 6 levels "red","orange",..: 4 6 3
    #   ..$ 3: Factor w/ 6 levels "red","orange",..: 6 1
    out$color
    # $`0`
    # [1] yellow
    # Levels: red orange yellow green blue purple
    # 
    # $`1`
    # [1] red
    # Levels: red orange yellow green blue purple
    # 
    # $`2`
    # [1] green  purple yellow
    # Levels: red orange yellow green blue purple
    # 
    # $`3`
    # [1] purple red   
    # Levels: red orange yellow green blue purple
    

    Strangely, however, the default display shows the integer values:

    out
    #   dressId   color
    # 1       6       3
    # 2       9       1
    # 3      10 4, 6, 3
    # 4      12    6, 1
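
    The integers in that display are most likely the factor's underlying codes, that is, each value's position in the levels vector. A small sketch using the same factor as in Ben's sample data (the names color and codes are illustrative):

    ```r
    # Same factor as in the sample data d
    color <- factor(c("yellow", "red", "green", "purple",
                      "yellow", "purple", "red"),
                    levels = c("red", "orange", "yellow",
                               "green", "blue", "purple"))

    # Integer codes stored inside the factor
    codes <- as.integer(color)

    # Indexing the levels by those codes recovers the labels
    levels(color)[codes]
    ```

    Here yellow is level 3, red is level 1, green is level 4 and purple is level 6, which matches the numbers printed above.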
    