I have a recurring situation where I set a value at the top of a long R script that's used in subsetting one or more data frames. Something like this:
cit
Generally, you shouldn't be programmatically generating names for data frames in your global environment. This is a good indication that you should be using a list to make your life simpler. See the FAQ "How to make a list of data frames?" for many examples and more discussion.
Using your concrete example, I would rewrite it in one of a few different ways.
library(dplyr)
gear_code <- 4
gear_subset <- paste("mtcars_", gear_code, sep = "")
mtcars_subset <- mtcars %>% filter(gear == gear_code)
head(mtcars_subset)
write.csv(mtcars_subset, file = paste(gear_subset, ".csv", sep = ""))
The goal seems to be to write a CSV called mtcars_X.csv that has the mtcars subset with gear == X. You don't need to keep an intermediate data frame around; this should be fine:
gear_code <- 4
mtcars %>% filter(gear == gear_code) %>%
write.csv(file = paste0('mtcars_', gear_code, '.csv'))
But you're probably coding it this way because you want to do it for each value of gear, and this is where dplyr's group_by helps:
mtcars %>% group_by(gear) %>%
do(csv = write.csv(file = sprintf("mt_gear_%s.csv", .[1, "gear"]), x = .))
If you really want individual data frame objects for each gear level, keeping them in a list is the way to go.
gear_df = split(mtcars, mtcars$gear)
This gives you a list of three data frames, one for each level of gear. They are already named with the levels, so to see the data frame with all the gear == 4 rows, do
gear_df[["4"]]
Generally, this is easier to work with than three data frames floating around. Anything you want to do to all of the data frames you can do at the same time with a single lapply, and even if you want to use a for loop it's simpler than eval(parse()) or get().
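As a rough sketch of what that looks like in practice, the following base R snippet applies one operation to every data frame in the list and then writes one CSV per gear level. The output directory (tempdir()) and the file name pattern are illustrative assumptions, not something from the question.

```r
# Split the built-in mtcars data by gear; the list is named by gear level.
gear_df <- split(mtcars, mtcars$gear)

# One lapply call applies the same operation to every data frame.
mean_mpg <- lapply(gear_df, function(d) mean(d$mpg))

# A for loop over the list names is just as direct.
# out_dir and the "mtcars_<gear>.csv" pattern are assumptions for illustration.
out_dir <- tempdir()
for (g in names(gear_df)) {
  out_file <- file.path(out_dir, paste0("mtcars_", g, ".csv"))
  write.csv(gear_df[[g]], file = out_file)
}
```

Because the gear level is the list name, there is no need to build variable names with paste() or to look them up again later.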
The truth is that objects in R don't have names per se. There are different kinds of environments, including a global one for every process. These environments hold lists of names that point to various objects. Two different names can point to the same object. This is best explained, to my knowledge, in the Environments chapter of Hadley Wickham's Advanced R book: http://adv-r.had.co.nz/Environments.html
So there is no way to change a name of a data frame, because there is nothing to change.
But you can make a new name (like newname) point to the same object (in your case a data frame object) as a given name (like oldname) simply by doing:
newname <- oldname
Note that if you change one of these variables a new copy will be made and the internal references will no longer be the same. This is due to R's "Copy on modify" semantics. See this post for an explanation: What exactly is copy-on-modify semantics in R, and where is the canonical source?
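A small sketch of that behaviour, using a made-up data frame (the column name x and its values are assumptions for illustration): after the assignment both names refer to equal data, and modifying one leaves the other untouched.

```r
# oldname and newname are the placeholder names used above;
# the data frame contents are made up for this example.
oldname <- data.frame(x = 1:3)
newname <- oldname                 # both names now refer to the same data

same_before <- identical(oldname, newname)

# Modifying through one name triggers a copy; the other name is unaffected.
newname$x[1] <- 100L

old_first <- oldname$x[1]          # still the original value
new_first <- newname$x[1]          # the modified value
```

This only shows the visible effect; it does not inspect memory addresses, which would need something like tracemem() or an add-on package.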
Hope that helps. I know the pain. Dynamic and functional languages are different than static and procedural languages...
Of course, it is possible to compute a new name for a data frame and register it in the environment with the assign command, and perhaps this is what you are looking for. However, referring to it afterwards would be rather convoluted.
Example (assuming df is the data frame in question):
assign( paste("city_stats", city_code, sep = ""), df )
As always, see the help for assign for more information: http://stat.ethz.ch/R-manual/R-devel/library/base/html/assign.html
Edit:
In reply to your edit, and the various comments about the problems with using eval(parse(...)), you could look up the object by its name like this:
head(get(gear_subset))
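To make the round trip concrete, here is a minimal sketch that stores a data frame under a computed name with assign, reads it back with get, and then does the same thing with a named list for comparison. The value of city_code and the contents of df are made up for illustration.

```r
# Hypothetical inputs: city_code and df stand in for the asker's real values.
city_code <- "NYC"
df <- data.frame(pop = c(10, 20, 30))

# Dynamic-name approach: build the name, register it, look it up again.
var_name <- paste("city_stats", city_code, sep = "")
assign(var_name, df)
from_env <- get(var_name)

# List approach: the code is simply the element name.
city_stats <- list()
city_stats[[city_code]] <- df
from_list <- city_stats[[city_code]]

# Both routes return the same data.
same_result <- identical(from_env, from_list)
```

Both work, but with assign and get every later reference has to rebuild the name string, whereas the list keeps all the data frames in one object that can be indexed or looped over.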