How can I change the name of a data frame

我寻月下人不归 2021-02-03 10:49

I have a recurring situation where I set a value at the top of a long block of R code, and that value is used in subsetting one or more data frames. Something like this:

cit         


        
2 Answers
  • 2021-02-03 11:37

    Generally, you shouldn't be programmatically generating names for data frames in your global environment. Wanting to do so is a good indication that you should be using a list to make your life simpler. See the FAQ "How to make a list of data frames?" for many examples and more discussion.

    Using your concrete example, I would rewrite it in one of a few different ways.

    library(dplyr)
    gear_code <- 4
    gear_subset <- paste("mtcars_", gear_code, sep = "")
    mtcars_subset <- mtcars %>% filter(gear == gear_code)
    head(mtcars_subset)
    write.csv(mtcars_subset, file = paste(gear_subset, ".csv", sep = ""))
    

    The goal seems to be to write a CSV called mtcars_X.csv that has the mtcars subset with gear == X. If you don't need to keep an intermediate data frame around, this should be fine:

    gear_code <- 4
    mtcars %>% filter(gear == gear_code) %>%
        write.csv(file = paste0('mtcars_', gear_code, '.csv'))
    

    But probably you're coding it this way because you want to do it for each value of gear, and this is where dplyr's group_by helps:

    CSVs for all the gears:

    # One CSV per gear level; inside do(), `.` is the data frame for the current group
    mtcars %>% group_by(gear) %>%
      do(csv = write.csv(x = ., file = sprintf("mt_gear_%s.csv", .$gear[1])))
    

    Data frames for each gear level:

    If you really want individual data frame objects for each gear level, keeping them in a list is the way to go.

    gear_df = split(mtcars, mtcars$gear)
    

    This gives you a list of three data frames, one for each level of gear. And they are named with the levels already, so to see the data frame with all the gear == 4 rows, do

    gear_df[["4"]]
    

    Generally, this is easier to work with than three data frames floating around. Anything you want to do to all of the data frames you can do at once with a single lapply, and even if you want to use a for loop, it's simpler than eval(parse()) or get().
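
    A minimal sketch of that list-based workflow, assuming the built-in mtcars data. The out_dir location and the "mtcars_gear_" file prefix are illustrative choices, not something taken from the question:

    ```r
    # Split mtcars into a named list of data frames, one per gear level
    gear_df <- split(mtcars, mtcars$gear)

    # Apply the same operation to every data frame in one call
    row_counts <- sapply(gear_df, nrow)

    # Write one CSV per list element, building file names from the list names
    out_dir <- tempdir()  # hypothetical output location
    paths <- file.path(out_dir, paste0("mtcars_gear_", names(gear_df), ".csv"))
    for (i in seq_along(gear_df)) {
      write.csv(gear_df[[i]], file = paths[i])
    }
    ```

    Because the gear level is stored as the list name, no object names ever need to be constructed from strings.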

  • 2021-02-03 11:51

    The truth is that objects in R don't have names per se. There exist different kinds of environments, including a global one for every process. These environments hold lists of names that point to various objects. Two different names can point to the same object. To my knowledge, this is best explained in the environments chapter of Hadley Wickham's Advanced R book: http://adv-r.had.co.nz/Environments.html

    So there is no way to change a name of a data frame, because there is nothing to change.

    But you can make a new name (like newname) point to the same object (in your case a data frame object) as a given name (like oldname) simply by doing:

       newname <- oldname
    

    Note that if you change one of these variables a new copy will be made and the internal references will no longer be the same. This is due to R's "Copy on modify" semantics. See this post for an explanation: What exactly is copy-on-modify semantics in R, and where is the canonical source?
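
    A small sketch of what that means in practice. The data frame here is made up for illustration:

    ```r
    oldname <- data.frame(x = 1:3)
    newname <- oldname        # both names now refer to the same object

    newname$x[1] <- 100L      # modifying through one name triggers a copy

    oldname$x[1]              # the original object is untouched
    newname$x[1]              # only the copy carries the new value
    ```

    After the modification the two names refer to different objects, so changes made through newname are never visible through oldname.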

    Hope that helps. I know the pain. Dynamic and functional languages are different than static and procedural languages...

    Of course, it is possible to compute a new name for a data frame and register it in the environment with the assign function, and perhaps this is what you are looking for. However, referring to it afterwards would be rather convoluted.

    Example (assuming df is the dataframe in question):

       assign(paste("city_stats", city_code, sep = ""), df)
    

    As always see the help for assign for more information http://stat.ethz.ch/R-manual/R-devel/library/base/html/assign.html

    Edit: In reply to your edit, and to the various comments about the problems with using eval(parse(...)), you could look up the object by its name like this:

    head(get(gear_subset))
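
    Putting assign and get together, a hedged sketch of the round trip using the built-in mtcars data. The subsets list at the end is a hypothetical alternative, not part of the question:

    ```r
    gear_code <- 4
    gear_subset <- paste0("mtcars_", gear_code)  # the computed name, "mtcars_4"

    # Bind the subset to the computed name in the current environment
    assign(gear_subset, mtcars[mtcars$gear == gear_code, ])

    # Retrieve the object again through its name
    head(get(gear_subset))

    # Storing the subset in a named list avoids assign()/get() altogether
    subsets <- list()
    subsets[[gear_subset]] <- mtcars[mtcars$gear == gear_code, ]
    ```

    Both approaches store the same data frame; the list version keeps it reachable with ordinary indexing instead of a name lookup.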
    