Unable to use tidyselect `everything()` in combination with `group_by()` and `fill()`

Submitted by 混江龙づ霸主 on 2019-12-07 18:33:42

Question


library(tidyverse)
df <- tibble(x1 = c("A", "A", "A", "B", "B", "B"),
             x2 = c(NA, 8, NA, NA, NA, 5),
             x3 = c(3, NA, 5, 9, 1, 9))
df
#> # A tibble: 6 x 3
#>   x1       x2    x3
#>   <chr> <dbl> <dbl>
#> 1 A        NA     3
#> 2 A         8    NA
#> 3 A        NA     5
#> 4 B        NA     9
#> 5 B        NA     1
#> 6 B         5     9

I have groups 'A' and 'B' in column x1. I need the NA values in columns x2 and x3 to be filled only from values within the same group, using the "updown" direction. That's simple enough; here's the code:

df %>% group_by(x1) %>% fill(c(x2, x3), .direction = "updown")
#> # A tibble: 6 x 3
#>   x1       x2    x3
#>   <chr> <dbl> <dbl>
#> 1 A         8     3
#> 2 A         8     5
#> 3 A         8     5
#> 4 B         5     9
#> 5 B         5     1
#> 6 B         5     9

My real-life issue is that my data frame doesn't contain just columns x1 through x3; it's more like x1 through x100, and the column names are arbitrary, in no logical order. To save myself the trouble of typing out all ~100 column names, I tried the tidyselect everything() helper shown below. That yields an understandable error, and I don't know how to work around it.

df %>% group_by(x1) %>% fill(everything(), .direction = "updown")
#> Error: Column `x1` can't be modified because it's a grouping variable

I asked a related question yesterday about naming exceptions to everything(), but my example was too simple and caused confusion about what I wanted in a solution. The proposed solution, "you can use select(-variable)", won't work in the case outlined above (I believe). Hence this new question. What do I do?

I should also mention that simply selecting a numeric column range (i.e. 2:100) won't work, because I need to exclude some columns by name (e.g. x45, x70). The order of the columns can also change from month to month, so I have to pick them by name. What I really want is something like everything() with an option such as everything_but(column.names = c(x45, x70)). Does it exist?


Answer 1:


You can do:

df %>%
 group_by(x1) %>%
 fill(-x1, .direction = "updown")

  x1       x2    x3
  <chr> <dbl> <dbl>
1 A         8     3
2 A         8     5
3 A         8     5
4 B         5     9
5 B         5     1
6 B         5     9
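
The same negative selection can name several columns at once. A minimal sketch, assuming two hypothetical extra columns, x45 and x70 (added here only for illustration), that should be left untouched:

```r
library(dplyr)
library(tidyr)

# Example data from the question, plus two hypothetical columns
# (x45, x70) that should NOT be filled.
df <- tibble(x1  = c("A", "A", "A", "B", "B", "B"),
             x2  = c(NA, 8, NA, NA, NA, 5),
             x3  = c(3, NA, 5, 9, 1, 9),
             x45 = c(NA, 1, NA, 2, NA, NA),
             x70 = c(7, NA, NA, NA, NA, 4))

# Fill every column except the grouping column and the named exceptions.
filled <- df %>%
  group_by(x1) %>%
  fill(-c(x1, x45, x70), .direction = "updown") %>%
  ungroup()
```

Because the exceptions are given by name, the result should not depend on where those columns sit in the data frame.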

This behavior is described in the tidyr documentation (see also the comment from @Gregor):

You can supply bare variable names, select all variables between x and z with x:z, exclude y with -y.
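
If the names to skip are held in a character vector rather than typed as bare names, the exclusion can be written with the tidyselect helper all_of(). This is a sketch under assumptions: skip_cols and the column x45 are hypothetical, and all_of() needs a reasonably recent tidyselect (it is not available in very old releases).

```r
library(dplyr)
library(tidyr)

df <- tibble(x1  = c("A", "A", "A", "B", "B", "B"),
             x2  = c(NA, 8, NA, NA, NA, 5),
             x45 = c(NA, 1, NA, 2, NA, NA))

# Hypothetical vector of columns to leave alone. The grouping column is
# included so that fill() never tries to modify it.
skip_cols <- c("x1", "x45")

filled <- df %>%
  group_by(x1) %>%
  fill(-all_of(skip_cols), .direction = "updown") %>%
  ungroup()
```

all_of() raises an error when a listed name is missing from the data; any_of() can be swapped in if missing names should be ignored silently.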



Source: https://stackoverflow.com/questions/58542413/unable-to-use-tidyselect-everything-in-combination-with-group-by-and-fi
