dplyr::select() with some variables that may not exist in the data frame?

旧城冷巷雨未停 提交于 2019-11-27 05:54:42

问题


I have a helper function (say foo()) that will be run on various data frames that may or may not contain specified variables. Suppose I have

library(dplyr)
d1 <- data_frame(taxon=1,model=2,z=3)
d2 <- data_frame(taxon=2,pss=4,z=3)

The variables I want to select are

vars <- intersect(names(data),c("taxon","model","z"))

that is, I'd like foo(d1) to return the taxon, model, and z columns, while foo(d2) returns just taxon and z.

If foo contains select(data,c(taxon,model,z)) then foo(d2) fails (because d2 doesn't contain model). If I use select(data,-pss) then foo(d1) fails similarly.

I know how to do this if I retreat from the tidyverse (just return data[vars]), but I'm wondering if there's a handy way to do this either (1) with a select() helper of some sort (tidyselect::select_helpers) or (2) with tidyeval (which I still haven't found time to get my head around!)


回答1:


Another option is select_if:

d2 %>% select_if(names(.) %in% c('taxon', 'model', 'z'))

# # A tibble: 1 x 2
#   taxon     z
#   <dbl> <dbl>
# 1     2     3



回答2:


You can use one_of(), which gives a warning when the column is absent but otherwise selects the correct columns:

d1 %>%
    select(one_of(c("taxon", "model", "z")))
d2 %>%
    select(one_of(c("taxon", "model", "z")))



回答3:


Using the builtin anscombe data frame for the example noting that z is not a column in anscombe :

anscombe %>% select(intersect(names(.), c("x1", "y1", "z")))

giving:

   x1    y1
1  10  8.04
2   8  6.95
3  13  7.58
4   9  8.81
5  11  8.33
6  14  9.96
7   6  7.24
8   4  4.26
9  12 10.84
10  7  4.82
11  5  5.68


来源:https://stackoverflow.com/questions/51529294/dplyrselect-with-some-variables-that-may-not-exist-in-the-data-frame

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!