How to group by a fixed number of rows in dplyr?

北海茫月 2021-01-12 21:11

I have a data frame:

set.seed(123)
x <- sample(10)
y <- x^2
my.df <- data.frame(x, y)

The result is this:

> my.         


        
2 Answers
  • 2021-01-12 21:30

    We can use rep or gl to create the grouping variable:

    library(dplyr)
    my.df %>% 
        group_by(grp = as.integer(gl(n(), 5, n()))) %>% 
        # or with rep (a drop-in replacement for the group_by() line above)
        # group_by(grp = rep(row_number(), length.out = n(), each = 5)) %>%
        summarise(sum = sum(y), mean = mean(y))
    # A tibble: 2 x 3
    #    grp   sum  mean
    #  <int> <dbl> <dbl>
    #1     1   174  34.8
    #2     2   211  42.2
    
  • 2021-01-12 21:55

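    Another option is to derive the group index with ceiling(row_number()/5). Note that naming it x replaces the original x column, so only y gets summarised: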

    my.df %>%
     group_by(x = ceiling(row_number()/5)) %>%
     summarise_all(list(sum = sum, mean = mean))
    
          x   sum  mean
      <dbl> <dbl> <dbl>
    1     1   174  34.8
    2     2   211  42.2
    
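Both answers boil down to turning the row position into a chunk index and grouping on it. Here is a minimal sketch of that idea using integer division and `across()`, assuming dplyr 1.0 or later. The name `chunk_size` is hypothetical, and `x` is hard-coded to a permutation chosen so the group sums match the 174 and 211 shown in the answers; it is not taken from the original `sample(10)` call, whose result depends on the R version.

```r
library(dplyr)

# Hard-coded permutation of 1..10 (an assumption for reproducibility)
x <- c(3, 8, 4, 7, 6, 1, 10, 9, 2, 5)
my.df <- data.frame(x, y = x^2)

chunk_size <- 5  # hypothetical name: number of rows per group

# Integer division maps row positions 1..n to chunk ids 1, 1, ..., 2, 2, ...
chunk_id <- (seq_len(nrow(my.df)) - 1) %/% chunk_size + 1

# Same index computed inside group_by(); across() names the results
# y_sum and y_mean
res <- my.df %>%
  group_by(grp = (row_number() - 1) %/% chunk_size + 1) %>%
  summarise(across(y, list(sum = sum, mean = mean)), .groups = "drop")

print(res)
```

If the number of rows is not a multiple of `chunk_size`, the last group simply ends up with fewer rows.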