I feel as if constantly in R, I get weird naming conflicts between attached dataframes and other objects, attaches/detaches not working as expected (just had two copies of t
attaches/detaches (sic) not working as expected
As mentioned by joran and BondedDust, using attach is always a bad idea, because it causes silly, obscure bugs like the ones you found.
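To make the kind of bug concrete: attach puts a copy of the data frame's columns on the search path, so later changes to the data frame are not seen through the bare column names, and a global variable with the same name silently masks the attached column. A minimal sketch, using a made-up data frame called sample_data (not from the question):

```r
# Hypothetical data frame, for illustration only
sample_data <- data.frame(x = 1:3)
attach(sample_data)

# Modify the original data frame after attaching it
sample_data$x <- sample_data$x * 10

# The bare name still resolves to the attached, now stale, copy
attached_x <- x
attached_x       # the old values, not the updated ones
sample_data$x    # the updated values

# A global variable with the same name masks the attached column
x <- "something else"
masked_x <- x
masked_x

# Clean up: remove the global variable and detach the data frame
rm(x)
detach(sample_data)
```

Nothing here raises an error or a warning, which is why these bugs are so hard to spot.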
naming dataframes with single letters
Don't do this either! Give your variables meaningful names, so that your code is easier to understand when you come back to it six months later.
If your problem is that you don't like repeatedly typing the name of a data frame to access columns, then use functions with special evaluation that avoid that need.
For example,
some_sample_data <- data.frame(x = 1:10, y = runif(10))
Subsetting
Repeated typing, hard work:
some_sample_data[some_sample_data$x > 3 & some_sample_data$y > 0.5, ]
Easier alternative using subset:
subset(some_sample_data, x > 3 & y > 0.5)
Reordering
Repeated typing, hard work:
order_y <- order(some_sample_data$y)
some_sample_data[order_y, ]
Easier using arrange from plyr:
library(plyr)
arrange(some_sample_data, y)
Transforming
Repeated typing, hard work:
some_sample_data$z <- some_sample_data$x + some_sample_data$y
Easier using with, within or mutate (the last one from plyr):
some_sample_data$z <- with(some_sample_data, x + y)
some_sample_data <- within(some_sample_data, z <- x + y)
some_sample_data <- mutate(some_sample_data, z = x + y)
Modelling
As mentioned by MrFlick, many functions, particularly modelling functions, have a data argument that lets you avoid repeating the data name.
Repeated typing, hard work:
lm(some_sample_data$y ~ some_sample_data$x)
Using a data argument:
lm(y ~ x, data = some_sample_data)
You can see all the functions in the stats package that have a data argument using:
library(sig)
stats_sigs <- list_sigs(pkg2env(stats))
Filter(function(fn) "data" %in% names(fn$args), stats_sigs)
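For functions that have no data argument, wrapping the call in with should give the same saving in typing. A small sketch, using cor from stats as the example function; the seed is arbitrary and only there to make the run reproducible:

```r
set.seed(1)  # arbitrary seed, an assumption for reproducibility
some_sample_data <- data.frame(x = 1:10, y = runif(10))

# Repeated typing: cor() takes vectors and has no data argument
cor_long <- cor(some_sample_data$x, some_sample_data$y)

# Same calculation, with the columns looked up inside the data frame
cor_short <- with(some_sample_data, cor(x, y))

# Both calls compute the same correlation
all.equal(cor_long, cor_short)
```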
It is better to keep a set of related data objects in a new environment. For example, I normally create an environment named e with this command:
e <- new.env()
Then you can access the individual objects in the environment with e$your_var.
Another benefit: you can use eapply to apply a function to every object in the environment.
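As a rough sketch of how that might look, assuming two made-up data frames (first_df and second_df are hypothetical names, not from the question):

```r
# Create a dedicated environment to hold related data objects
e <- new.env()

# Hypothetical data frames stored in the environment
e$first_df  <- data.frame(x = 1:3)
e$second_df <- data.frame(x = 1:5, y = 5:1)

# Access a single object by name
e$first_df

# Apply a function to every object in the environment;
# eapply returns a named list, in no guaranteed order
row_counts <- eapply(e, nrow)
row_counts[order(names(row_counts))]

# List what the environment contains
ls(e)
```

Because the objects live in e rather than on the search path, they cannot clash with variables of the same name in the global environment.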