I have a number of columns that I would like to remove from a data frame. I know that we can delete them individually using something like:
df$x <- NULL
Beyond select(-one_of(drop_col_names)), demonstrated in earlier answers, there are a couple of other dplyr options for dropping columns with select() that do not require listing every column name (using the dplyr starwars sample data for some variety in column names):
library(dplyr)

starwars %>%
  select(-(name:mass)) %>%        # the range of columns from 'name' to 'mass'
  select(-contains('color')) %>%  # any column name that contains 'color'
  select(-starts_with('bi')) %>%  # any column name that starts with 'bi'
  select(-ends_with('er')) %>%    # any column name that ends with 'er'
  select(-matches('^f.+s$')) %>%  # any column name matching the regex pattern
  select_if(~!is.list(.)) %>%     # not by column name but by data type
  head(2)
# A tibble: 2 x 2
homeworld species
1 Tatooine Human
2 Tatooine Droid
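In dplyr 1.0.0 and later, select_if() is superseded, and the same type-based drop can be written with where() inside select(). The following is a minimal sketch, assuming that dplyr version; the toy tibble and its column names are made up for illustration:

```r
library(dplyr)

# Hypothetical toy data: one list column among ordinary columns
toy <- tibble(
  id    = 1:2,
  label = c("a", "b"),
  tags  = list(c("x", "y"), "z")
)

# Drop every list column, selecting by type rather than by name
no_lists <- toy %>% select(!where(is.list))
names(no_lists)
```

Only the id and label columns should remain, since tags is the only list column.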
If you need to drop a column that may or may not exist in the data frame, here's a slight twist using select_if() that, unlike one_of(), will not throw an "Unknown columns:" warning if the column name does not exist. In this example 'bad_column' is not a column in the data frame:
starwars %>%
  select_if(!names(.) %in% c('height', 'mass', 'bad_column'))
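With dplyr 1.0.0 or later, any_of() is another way to get the same tolerance for missing names: it silently ignores any name that is not present in the data frame. A small sketch, assuming that dplyr version; df and drop_cols are made-up names for illustration:

```r
library(dplyr)

# Hypothetical data frame; 'bad_column' is deliberately absent
df <- tibble(
  name   = c("Luke", "C-3PO"),
  height = c(172, 167),
  mass   = c(77, 75)
)
drop_cols <- c("height", "mass", "bad_column")

# any_of() drops the names that exist and ignores the rest
kept <- df %>% select(-any_of(drop_cols))
names(kept)
```

This keeps the drop list as a plain character vector, which is convenient when the same list is applied to several data frames with slightly different columns.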