get lhs object name when piping with dplyr

遥遥无期 2021-02-13 15:43

I'd like to have a function that can use the pipe operator as exported from dplyr. I am not using magrittr.

df %>% my_function

How can I get the name df inside my_function?

5 answers
  •  一向 (original poster)
    2021-02-13 16:18

    The SO answer that JBGruber links to in the comments mostly solves the problem. It works by moving up through the calling frames until it finds one that contains a variable named chain_parts, which the pipe's implementation defines internally, and then returns lhs from that frame. Because it depends on those internals, it may stop working if the pipe is implemented differently. The only thing missing is the requirement that the function output both the name of the original data frame and the manipulated data; I gleaned the latter requirement from one of the OP's comments. For that we just need to return a list containing both, which we can do by modifying MrFlick's answer:

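    get_orig_name <- function(df) {
        # Walk up the calling frames until we reach the one that belongs to the
        # pipe, recognised by its internal variable chain_parts.
        i <- 1
        while (!("chain_parts" %in% ls(envir = parent.frame(i))) && i < sys.nframe()) {
            i <- i + 1
        }
        # lhs in that frame is the unevaluated left-most expression of the pipeline.
        list(name = deparse(parent.frame(i)$lhs), output = df)
    }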
    

    Now we can append get_orig_name to the end of any pipeline to get the manipulated data and the original data frame's name in a list. We access both using $:

    mtcars %>% summarize_all(mean) %>% get_orig_name
    
    #### OUTPUT ####
    
    $name
    [1] "mtcars"
    
    $output
           mpg    cyl     disp       hp     drat      wt     qsec     vs      am   gear   carb
    1 20.09062 6.1875 230.7219 146.6875 3.596563 3.21725 17.84875 0.4375 0.40625 3.6875 2.8125
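
The frame-walking idea behind get_orig_name can be seen in isolation with a small base R sketch. Everything here is hypothetical and only for illustration: find_in_callers, the three nested functions, and the variable marker stand in for the pipe's frames and its internal variable, and none of them are part of dplyr or magrittr.

```r
# Hypothetical helper (not part of magrittr or dplyr): search the chain of
# calling frames for a variable and return its value, or NULL if none has it.
find_in_callers <- function(var_name) {
  for (i in seq_len(sys.nframe())) {
    env <- parent.frame(i)  # environment of the i-th caller up the stack
    if (exists(var_name, envir = env, inherits = FALSE)) {
      return(get(var_name, envir = env, inherits = FALSE))
    }
  }
  NULL
}

# Three nested calls; only the outermost one defines `marker`.
outer_fn <- function() {
  marker <- "set in outer_fn"
  middle_fn()
}
middle_fn <- function() inner_fn()
inner_fn <- function() find_in_callers("marker")

result <- outer_fn()
print(result)
```

Using exists() with inherits = FALSE checks each frame on its own rather than its enclosing environments, and returning NULL makes the "not found" case explicit instead of silently reading from whichever frame the search stopped at.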
    

    I should also mention that, although I think the details of this strategy are interesting, I also think it is needlessly complicated. It sounds like the OP's goal is to manipulate the data and then write it to a file with the same name as the original, unmanipulated, data frame, which can easily be done using more straightforward methods. For example, if we are dealing with multiple data frames we can just do something like the following:

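    library(dplyr)

    # Name each data frame explicitly so the name is available inside the loop.
    df_list <- list(mtcars = mtcars, iris = iris)
    
    for(name in names(df_list)){
        df_list[[name]] %>% 
            group_by_if(is.factor) %>%   # group by any factor columns (e.g. Species in iris)
            summarise_all(mean) %>% 
            write.csv(paste0(name, ".csv"))   # file named after the original data frame
    }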
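
For a single data frame, another straightforward option is to pass it as an ordinary argument instead of piping it in, so that substitute() still sees the variable name. The sketch below is only an illustration in base R: with_orig_name and its transform_fn argument are hypothetical names, and colMeans stands in for whatever manipulation is needed.

```r
# Hypothetical helper: apply `transform_fn` to a data frame passed as an
# ordinary argument and return the result together with the variable's name.
with_orig_name <- function(df, transform_fn) {
  # substitute() returns the expression the caller wrote for `df`.
  name <- deparse(substitute(df))
  list(name = name, output = transform_fn(df))
}

res <- with_orig_name(mtcars, function(d) colMeans(d))
print(res$name)
print(res$output[["mpg"]])
```

This only works when the data frame is written directly in the call; if it arrives through %>%, substitute() sees the pipe's placeholder rather than the original name, which is why the frame-walking approach was needed in the first place.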
    
