Achieving the equivalent of rbind using tidyr [duplicate]

Question

I have some data that looks like this.

set.seed(1)
df <- data.frame(group = rep(letters[1:2],each=3),
                 day = rep(1:3,2),
                 var1_mean = round(rnorm(6),2),
                 var1_sd = round(rnorm(6,5),2),
                 var2_mean = round(rnorm(6),2),
                 var2_sd = round(rnorm(6,5),2))

df

# output

# group day var1_mean var1_sd var2_mean var2_sd
#     a   1     -0.63    5.49     -0.62    5.82
#     a   2      0.18    5.74     -2.21    5.59
#     a   3     -0.84    5.58      1.12    5.92
#     b   1      1.60    4.69     -0.04    5.78
#     b   2      0.33    6.51     -0.02    5.07
#     b   3     -0.82    5.39      0.94    3.01

Here is what I would like it to look like, along with the code I used to get there:

library(tidyverse)
rbind(df %>% select(group, day, starts_with("var1")) %>% rename(mean = var1_mean, sd = var1_sd),
      df %>% select(group, day, starts_with("var2")) %>% rename(mean = var2_mean, sd = var2_sd)) %>%
  add_column(var = rep(paste0("var",1:2),each=6), .before = "group")

# output

#   var group day  mean   sd
#  var1     a   1 -0.63 5.49
#  var1     a   2  0.18 5.74
#  var1     a   3 -0.84 5.58
#  var1     b   1  1.60 4.69
#  var1     b   2  0.33 6.51
#  var1     b   3 -0.82 5.39
#  var2     a   1 -0.62 5.82
#  var2     a   2 -2.21 5.59
#  var2     a   3  1.12 5.92
#  var2     b   1 -0.04 5.78
#  var2     b   2 -0.02 5.07
#  var2     b   3  0.94 3.01

My code gets the job done, but I was wondering whether pivot_longer() or some other function could do it less clunkily.

Answer 1:

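We can use pivot_longer(), setting names_sep = "_" to split each column name at the underscore and giving names_to a grouping name together with the special ".value" entry. The piece before the underscore (var1, var2) fills the grouping column, while ".value" makes the piece after it (mean, sd) become the names of the output value columns: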

library(dplyr)
library(tidyr)
df %>% 
    pivot_longer(cols = starts_with('var'), 
       names_to = c('grp', '.value'), names_sep="_")
#   group   day grp    mean    sd
#   <chr> <int> <chr> <dbl> <dbl>
# 1 a         1 var1  -0.63  5.49
# 2 a         1 var2  -0.62  5.82
# 3 a         2 var1   0.18  5.74
# 4 a         2 var2  -2.21  5.59
# 5 a         3 var1  -0.84  5.58
# 6 a         3 var2   1.12  5.92
# 7 b         1 var1   1.6   4.69
# 8 b         1 var2  -0.04  5.78
# 9 b         2 var1   0.33  6.51
#10 b         2 var2  -0.02  5.07
#11 b         3 var1  -0.82  5.39
#12 b         3 var2   0.94  3.01
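
The same split can also be written with names_pattern, which takes a regular expression with one capture group per entry in names_to. This is only a sketch of an alternative, not part of the original answer; the specific regex and the object names by_sep and by_pattern are assumptions made here for illustration. It may be handy when column names contain more than one underscore and names_sep would be ambiguous.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
set.seed(1)
df <- data.frame(group = rep(letters[1:2], each = 3),
                 day = rep(1:3, 2),
                 var1_mean = round(rnorm(6), 2),
                 var1_sd = round(rnorm(6, 5), 2),
                 var2_mean = round(rnorm(6), 2),
                 var2_sd = round(rnorm(6, 5), 2))

# Split on the underscore, as in the answer
by_sep <- df %>%
  pivot_longer(cols = starts_with("var"),
               names_to = c("grp", ".value"),
               names_sep = "_")

# Capture the two pieces with a regex instead:
# group 1 (var + digits) goes to grp, group 2 (mean/sd) to .value
by_pattern <- df %>%
  pivot_longer(cols = starts_with("var"),
               names_to = c("grp", ".value"),
               names_pattern = "^(var\\d+)_(.*)$")

# Both approaches should give the same table
all.equal(by_sep, by_pattern)
```
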

If the 'grp' column is not needed, it can be dropped afterwards:

df %>% 
    pivot_longer(cols = starts_with('var'), 
       names_to = c('grp', '.value'), names_sep="_") %>%
    select(-grp)
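
To reproduce the layout requested in the question (a leading var column, with rows ordered by variable), one possible sketch names the grouping column var instead of grp, moves it to the front, and sorts the rows. The object name res and the select()/arrange() steps are choices made here for illustration and are not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
set.seed(1)
df <- data.frame(group = rep(letters[1:2], each = 3),
                 day = rep(1:3, 2),
                 var1_mean = round(rnorm(6), 2),
                 var1_sd = round(rnorm(6, 5), 2),
                 var2_mean = round(rnorm(6), 2),
                 var2_sd = round(rnorm(6, 5), 2))

res <- df %>%
  pivot_longer(cols = starts_with("var"),
               names_to = c("var", ".value"),  # name the grouping column "var"
               names_sep = "_") %>%
  select(var, everything()) %>%                # put "var" first
  arrange(var, group, day)                     # all var1 rows, then all var2 rows

res
```

The result is a tibble rather than a data.frame; wrapping it in as.data.frame() would convert it if a plain data frame is required.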

Source: https://stackoverflow.com/questions/63822318/achieving-the-equivalent-of-rbind-using-tidyr

Tags

tidyverse