Naming conflicts in R when using attach

旧时难觅i 2021-01-16 12:28

I feel as if constantly in R, I get weird naming conflicts between attached dataframes and other objects, attaches/detaches not working as expected (just had two copies of t

2 answers
  • 2021-01-16 12:37

    attaches/detaches (sic) not working as expected

    As mentioned by joran and BondedDust, using attach is always a bad idea, because it causes silly, obscure bugs like you found.
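
    A minimal sketch of the kind of conflict attach can cause. The names survey_data, x and w are hypothetical, chosen only for illustration. A global variable silently takes priority over an attached column of the same name, and the attached copy does not follow later changes to the data frame.

    ```r
    # Hypothetical data, for illustration only
    survey_data <- data.frame(x = 1:3, w = 4:6)
    x <- c(10, 20, 30)                  # global variable sharing a column's name

    attach(survey_data)                 # at top level, R reports that x is masked
    mean_from_attach <- mean(x)         # uses the global x, not survey_data$x

    survey_data$w <- survey_data$w * 2  # change the data frame after attaching
    w_seen_via_attach <- w              # still the values from attach time

    detach(survey_data)                 # every attach() needs a matching detach()

    # Explicit access is unambiguous
    mean_from_column <- mean(survey_data$x)
    ```

    Here mean_from_attach is computed from the global x, and w_seen_via_attach does not reflect the doubled column, which is easy to miss in a longer script.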

    naming dataframes with single letters

    Don't do this either! Give your variables meaningful names, so that your code is easier to understand when you come back to it six months later.


    If your problem is that you don't like repeatedly typing the name of a data frame to access columns, then use functions with special evaluation that avoid that need.

    For example,

    some_sample_data <- data.frame(x = 1:10, y = runif(10))
    

    Subsetting

    Repeated typing, hard work:

    some_sample_data[some_sample_data$x > 3 & some_sample_data$y > 0.5, ]
    

    Easier alternative using subset:

    subset(some_sample_data, x > 3 & y > 0.5)
    

    Reordering

    Repeated typing, hard work:

    order_y <- order(some_sample_data$y)
    some_sample_data[order_y, ]
    

    Easier using arrange from plyr:

    library(plyr)
    arrange(some_sample_data, y)
    

    Transforming

    Repeated typing, hard work:

    some_sample_data$z <- some_sample_data$x + some_sample_data$y
    

    Easier using with, within or mutate (the last one from plyr):

    some_sample_data$z <- with(some_sample_data, x + y)
    some_sample_data <- within(some_sample_data, z <- x + y)
    some_sample_data <- mutate(some_sample_data, z = x + y)  # needs library(plyr)
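
    If you would rather not depend on plyr, base R's transform offers a similar shortcut. A small sketch, reusing the sample data from above; the names with_transform and explicit are only for illustration:

    ```r
    some_sample_data <- data.frame(x = 1:10, y = runif(10))

    # transform() evaluates its arguments inside the data frame,
    # so the columns can be referred to by bare name
    with_transform <- transform(some_sample_data, z = x + y)

    # The explicit version, for comparison
    explicit <- some_sample_data
    explicit$z <- explicit$x + explicit$y

    # Check that both approaches produce the same new column
    same_result <- isTRUE(all.equal(with_transform$z, explicit$z))
    ```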
    

    Modelling

    As mentioned by MrFlick, many functions, particularly modelling functions, have a data argument that lets you avoid repeating the data name.

    Repeated typing, hard work:

    lm(some_sample_data$y ~ some_sample_data$x)
    

    Using a data argument:

    lm(y ~ x, data = some_sample_data)
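
    The same pattern works for other formula-based functions in base R. A brief sketch, using a hypothetical data frame trial_data with a grouping column:

    ```r
    # Hypothetical data, for illustration only
    trial_data <- data.frame(
      group = rep(c("a", "b"), each = 5),
      score = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    )

    # Mean score per group
    group_means <- aggregate(score ~ group, data = trial_data, FUN = mean)

    # Number of rows per group
    group_counts <- xtabs(~ group, data = trial_data)

    # Two-sample t-test comparing the groups
    test_result <- t.test(score ~ group, data = trial_data)
    ```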
    

    You can see all the functions in the stats package that have a data argument using:

    library(sig)
    stats_sigs <- list_sigs(pkg2env(stats))
    Filter(function(fn) "data" %in% names(fn$args), stats_sigs)
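
    If the sig package is not installed, a rough equivalent can be sketched in base R by inspecting the formal arguments of each exported function. This is an alternative approach, not what sig does internally, and the variable names are only illustrative:

    ```r
    stats_ns <- asNamespace("stats")
    exported_names <- getNamespaceExports("stats")

    # Keep the names of exported functions that have a formal argument called "data"
    has_data_arg <- Filter(function(name) {
      obj <- get(name, envir = stats_ns)
      is.function(obj) && "data" %in% names(formals(obj))
    }, exported_names)

    head(sort(has_data_arg))
    ```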
    
  • 2021-01-16 13:01

    It is better to keep a set of related objects in a new environment. For example, I normally create an environment called e with this command.

    e <- new.env()
    

    Then you can access the individual objects in the environment with e$your_var.

    Other benefits:

    1. You can use eapply to apply a function to every object in the environment.
    2. You can list its contents with ls(e).
    3. You can clear it with rm(list = ls(e), envir = e).
    4. It avoids conflicts between your local variables and the function variables that you want to create.
    5. ...
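
    A short sketch of working with such an environment; the object names heights and weights are hypothetical:

    ```r
    e <- new.env()

    # Store objects in the environment, either with $ or with assign()
    e$heights <- c(150, 160, 170)
    assign("weights", c(50, 60, 70), envir = e)

    # List the contents and apply a function to every object
    object_names <- sort(ls(e))
    object_means <- eapply(e, mean)   # returns a named list

    # A variable of the same name outside e does not interfere
    heights <- "unrelated"
    heights_in_e <- e$heights

    # Remove everything stored in e
    rm(list = ls(e), envir = e)
    remaining <- length(ls(e))
    ```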