How NOT to select columns using dplyr's select() when you have a character vector of column names?

孤街浪徒 2021-02-05 07:54

I am trying to deselect columns in my dataset using dplyr, but I have not been able to get it working since last night.

I am well aware of workarounds, but I am being strictly try

1 Answer
  • 2021-02-05 08:34

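    Edit: The OP's actual question was how to use a character vector to select or deselect columns from a data frame. Use the one_of() helper function for that (in tidyselect 1.0.0 and later, one_of() is superseded by all_of() and any_of(), though it still works):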

    library(dplyr)
    
    colnames(iris)
    
    # [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"
    
    cols <- c("Petal.Length", "Sepal.Length")
    
    select(iris, one_of(cols)) %>% colnames
    
    # [1] "Petal.Length" "Sepal.Length"
    
    select(iris, -one_of(cols)) %>% colnames
    
    # [1] "Sepal.Width" "Petal.Width" "Species"
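    
    With more recent releases (tidyselect 1.0.0 and later, which dplyr 1.0.0 builds on), the same selection can be written with all_of() and any_of(). This is a minimal sketch assuming such a version is installed; maybe_cols is a hypothetical vector introduced here only to show the difference between the two helpers:

    ```r
    library(dplyr)

    cols <- c("Petal.Length", "Sepal.Length")

    # all_of() is strict: it raises an error if any name in cols
    # is not a column of the data frame
    kept <- select(iris, all_of(cols)) %>% colnames()
    kept
    # [1] "Petal.Length" "Sepal.Length"

    # Negate the helper to drop those columns instead
    dropped <- select(iris, -all_of(cols)) %>% colnames()
    dropped
    # [1] "Sepal.Width" "Petal.Width" "Species"

    # any_of() is lenient: names that are not columns are ignored
    maybe_cols <- c("Petal.Length", "Not.A.Column")  # hypothetical vector
    dropped_lenient <- select(iris, -any_of(maybe_cols)) %>% colnames()
    dropped_lenient
    # [1] "Sepal.Length" "Sepal.Width"  "Petal.Width"  "Species"
    ```

    any_of() is the safer choice when dropping columns that may already be gone; all_of() is the better fit when a missing column should be treated as a mistake.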
    

    You should have a look at the select helpers (type ?select_helpers) because they're incredibly useful. From the docs:

    starts_with(): starts with a prefix

    ends_with(): ends with a suffix

    contains(): contains a literal string

    matches(): matches a regular expression

    num_range(): a numerical range like x01, x02, x03.

    one_of(): variables in character vector.

    everything(): all variables.
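
    A few of the helpers listed above, tried on the iris dataset. This is only an illustrative sketch; num_range() is left out because iris has no numbered columns:

    ```r
    library(dplyr)

    # starts_with(): columns whose names begin with "Petal"
    select(iris, starts_with("Petal")) %>% colnames()
    # [1] "Petal.Length" "Petal.Width"

    # contains(): columns whose names contain the literal string "Width"
    select(iris, contains("Width")) %>% colnames()
    # [1] "Sepal.Width" "Petal.Width"

    # matches(): columns whose names match a regular expression
    select(iris, matches("^Sepal\\.")) %>% colnames()
    # [1] "Sepal.Length" "Sepal.Width"

    # everything(): put Species first, then all remaining columns
    select(iris, Species, everything()) %>% colnames()
    # [1] "Species"      "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"
    ```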


    Given a data frame df with column names a through z, use select like this:

    select(df, -a, -b, -c, -d, -e)
    
    # OR
    
    select(df, -c(a, b, c, d, e))
    
    # OR
    
    select(df, -(a:e))
    
    # OR if you want to keep b
    
    select(df, -a, -(c:e))
    
    # OR a different way to keep b, by just putting it back in
    # (b then appears after the remaining columns)
    
    select(df, -(a:e), b)
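    
    To see which columns, and in which order, these calls return, here is a small sketch. The one-row data frame df is hypothetical, built here only so the calls can be run:

    ```r
    library(dplyr)

    # Hypothetical one-row data frame whose columns are named a through z
    df <- as.data.frame(as.list(setNames(1:26, letters)))

    # Drop a through e: f through z remain
    drop_range <- df %>% select(-(a:e)) %>% colnames()

    # Drop a and c through e: b stays in its original position
    keep_b <- df %>% select(-a, -(c:e)) %>% colnames()

    # Drop a through e, then add b back: b ends up last
    readd_b <- df %>% select(-(a:e), b) %>% colnames()

    head(keep_b, 3)
    # [1] "b" "f" "g"
    tail(readd_b, 3)
    # [1] "y" "z" "b"
    ```

    The last two forms keep the same set of columns but in a different order, which matters if later code relies on column position.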
    

    So if I wanted to omit two of the columns from the iris dataset, I could say:

    colnames(iris)
    
    # [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"
    
    select(iris, -c(Sepal.Length, Petal.Length)) %>% colnames()
    
    # [1] "Sepal.Width" "Petal.Width" "Species" 
    

    But of course, the best and most concise way to achieve that is using one of select's helper functions:

    select(iris, -ends_with(".Length")) %>% colnames()
    
    # [1] "Sepal.Width" "Petal.Width" "Species"   
    

    P.S. It's weird that you are passing quoted values to dplyr; one of its big niceties is that you don't have to keep typing out quotes all the time. As you can see, bare column names work fine with dplyr and ggplot2.
