Adding a prefix or suffix to most data.frame variable names in a piped R workflow

一个人的身影 2020-12-03 05:07

I want to add a suffix or prefix to most variable names in a data.frame, typically after they've all been transformed in some way and before performing a join. I don't ha

6 answers
  • 2020-12-03 05:39

    You can pass functions to rename_at, so do

    means14 <- dat14 %>%
      group_by(class) %>%
      select(-ID) %>%
      summarise_all(funs(mean(.))) %>% 
      rename_at(vars(-class),function(x) paste0(x,"_2014"))
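
    For readers on a newer dplyr (1.0.0 or later, an assumption about your setup), the same idea can be sketched with rename_with() and across(). The toy dat14 below is invented for illustration and is not the question's data.

    ```r
    library(dplyr)

    # Hypothetical toy data shaped like the question's dat14
    dat14 <- data.frame(ID = 1:4,
                        speed = c(1, 2, 3, 4),
                        power = c(0, 1, 2, 3),
                        class = c("a", "b", "a", "b"))

    means14 <- dat14 %>%
      select(-ID) %>%
      group_by(class) %>%
      # average every non-grouping column within each class
      summarise(across(everything(), mean)) %>%
      # append the suffix to every column except the grouping column
      rename_with(~ paste0(.x, "_2014"), .cols = -class)

    names(means14)
    ```

    The .cols argument plays the role of vars(-class) above, so the grouping column keeps its name for a later join.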
    
  • 2020-12-03 05:43

    This is more of a step back, but you might consider reshaping your data so that you can apply the function to multiple years at the same time. This preserves tidiness. If you eventually want to compare different years, it may make sense to store the year as a separate variable in the data frame rather than in the column names. You should be able to use summarise_ to get the mean_year behavior. See http://cran.r-project.org/web/packages/dplyr/vignettes/nse.html

    library(dplyr)
    library(tidyr)
    set.seed(1)
    dat14 <- data.frame(ID = 1:10, speed = runif(10), power = rpois(10, 1),
                        force = rexp(10), class = rep(c("a", "b"),5))
    
    dat14 %>% 
      gather(variable, value, -ID, -class) %>% 
      mutate(year = 2014) %>% 
      group_by(class, year, variable)%>% 
      summarise(mean = mean(value))
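
    To illustrate why keeping the year as a variable helps, here is a minimal sketch that stacks two years and summarises them in one pass. It assumes a newer tidyr (pivot_longer() in place of gather()), and dat15 is a made-up second year that does not appear in the question.

    ```r
    library(dplyr)
    library(tidyr)

    # Hypothetical data for two years; dat15 is invented for illustration
    dat14 <- data.frame(ID = 1:4, speed = c(1, 2, 3, 4), class = c("a", "b", "a", "b"))
    dat15 <- data.frame(ID = 1:4, speed = c(2, 4, 6, 8), class = c("a", "b", "a", "b"))

    yearly_means <- bind_rows(`2014` = dat14, `2015` = dat15, .id = "year") %>%
      # one row per ID, year and measured variable
      pivot_longer(-c(ID, class, year), names_to = "variable", values_to = "value") %>%
      group_by(class, year, variable) %>%
      summarise(mean = mean(value), .groups = "drop")

    yearly_means
    ```

    With this layout no renaming is needed before comparing years, because the year is data rather than part of a column name.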
    
  • 2020-12-03 05:46

    As of February 2017 you can do this with the dplyr command rename_(...).

    In the case of this example you could do.

    dat14 %>%
      group_by(class) %>%
      select(-ID) %>%
      summarise_each(funs(mean(.))) %>%
      rename_(.dots = setNames(names(.)[-1], paste0(names(.)[-1], "_mean_2014")))
    

    This is rather similar to the answer with setNames but works with tibbles too!

  • 2020-12-03 06:00

    After additional experimenting since posting this question, I've found that the setNames function will work with the piping as it returns a data.frame:

    dat14 %>%
      group_by(class) %>%
      select(-ID) %>%
      summarise_each(funs(mean(.))) %>%
      setNames(c(names(.)[1], paste0(names(.)[-1],"_mean_2014"))) 
    
      class speed_mean_2014 power_mean_2014 force_mean_2014
    1     a       0.5572500             0.8       0.5519802
    2     b       0.2850798             0.6       1.0888116
    
  • 2020-12-03 06:03

    While Sam Firke's solution using setNames() is certainly the only solution keeping an unbroken pipe, it will not work with the tbl objects from dplyr, since the column names are not accessible through the usual base R naming functions. Here is a function that you can use within a pipe with tbl objects as well, thanks to this solution by hrbrmstr. It adds predefined prefixes and suffixes at the specified column indices. The default is all columns.

    tbl.renamer <- function(tbl,prefix="x",suffix=NULL,index=seq_along(tbl_vars(tbl))){
      newnames <- tbl_vars(tbl) # Get old variable names
      names(newnames) <- newnames
      names(newnames)[index] <- paste0(prefix,".",newnames,suffix)[index] # create a named vector for .dots
      rename_(tbl,.dots=newnames) # rename the variables
    }
    

    Example usage (assuming auth_user is a tbl_sql object):

    auth_user %>% tbl_vars
    tbl.renamer(auth_user) %>% tbl_vars
    auth_user %>% tbl.renamer %>% tbl_vars
    auth_user %>% tbl.renamer(index = c(1,5)) %>% tbl_vars
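
    For a local data frame or tibble, a similar helper can be sketched with rename_with(), assuming dplyr 1.0.0 or later. The name add_affix and its defaults are hypothetical. Unlike tbl.renamer it does not insert a "." separator on its own, and it has not been checked against tbl_sql objects.

    ```r
    library(dplyr)

    # Hypothetical helper for local data frames; index holds column positions
    add_affix <- function(tbl, prefix = "x.", suffix = "", index = seq_along(tbl)) {
      rename_with(tbl, ~ paste0(prefix, .x, suffix), .cols = all_of(index))
    }

    # Hypothetical example data
    df <- data.frame(a = 1, b = 2, c = 3)
    renamed <- df %>% add_affix(index = c(1, 3))
    names(renamed)
    ```

    Only the columns at the given positions are renamed; the rest keep their names.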
    
  • 2020-12-03 06:04

    This is a bit quicker, but not totally what you want:

    dat14 %>%
      group_by(class) %>%
      select(-ID) %>%
      summarise_each(funs(mean(.))) -> means14 
    
    library(magrittr)  # provides the %<>% operator
    names(means14)[-1] %<>% paste0("_mean_2014")
    

    If you haven't used the %<>% operator (from the magrittr package) before, it is definitely worth checking out; it's a super-useful tool.

    You can also use it to recompute or round columns in place, for example df$meancolumn %<>% round(). This comes up very often and saves a lot of typing.
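
    A minimal sketch of that in-place update, using a made-up data frame (df and meancolumn are hypothetical names):

    ```r
    library(magrittr)

    # Hypothetical data frame for illustration
    df <- data.frame(meancolumn = c(1.234, 5.678))

    # Equivalent to: df$meancolumn <- round(df$meancolumn)
    df$meancolumn %<>% round()

    df$meancolumn
    ```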
