Is it possible to set na.rm to TRUE globally?

情歌与酒 2020-12-05 13:56

For commands like max, the option na.rm is set to FALSE by default. I understand why this is a good idea in general, but I'd like to t

4 answers
  • 2020-12-05 14:48

    One (dangerous) workaround is to do the following:

    1. List all functions that have na.rm as an argument. Here I limited my search to the base package.
    2. Fetch each function and add this line at the beginning of its body: na.rm = TRUE
    3. Assign the function back to the base package.

    First, I collect the names of all functions that take na.rm as an argument (na.rm.f):

    uses_arg <- function(x,arg) 
      is.function(fx <- get(x)) && 
      arg %in% names(formals(fx))
    basevals <- ls(pos="package:base")      
    na.rm.f <- basevals[sapply(basevals,uses_arg,'na.rm')]
    

    EDIT: a better method to get all functions that have an na.rm argument (thanks to mnel's comment):

    Funs <- Filter(is.function,sapply(ls(baseenv()),get,baseenv()))
    na.rm.f <- names(Filter(function(x) any(names(formals(args(x)))%in% 'na.rm'),Funs))
    

    So na.rm.f looks like:

     [1] "all"                     "any"                     "colMeans"                "colSums"                
     [5] "is.unsorted"             "max"                     "mean.default"            "min"                    
     [9] "pmax"                    "pmax.int"                "pmin"                    "pmin.int"               
    [13] "prod"                    "range"                   "range.default"           "rowMeans"               
    [17] "rowsum.data.frame"       "rowsum.default"          "rowSums"                 "sum"                    
    [21] "Summary.data.frame"      "Summary.Date"            "Summary.difftime"        "Summary.factor"         
    [25] "Summary.numeric_version" "Summary.ordered"         "Summary.POSIXct"         "Summary.POSIXlt" 
    

    Then, for each function, I change the body. The code is inspired by the data.table package (FAQ 2.23), which adds one line to the start of rbind.data.frame and cbind.data.frame.

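    ll <- lapply(na.rm.f, function(x) {
      tt <- get(x)
      ss <- body(tt)
      # Wrap a single-expression body in braces so a statement can be prepended
      if (class(ss) != "{") ss <- as.call(c(as.name("{"), ss))
      # Functions without a usable body (primitives such as sum have none) are only reported
      if (length(ss) < 2) print(x)
      else {
        # Skip functions that have already been patched
        if (!length(grep("na.rm = TRUE", ss[[2]], fixed = TRUE))) {
          # Make room at position 2 and insert `na.rm = TRUE` as the first statement
          ss <- ss[c(1, NA, 2:length(ss))]
          ss[[2]] <- parse(text = "na.rm = TRUE")[[1]]
          body(tt) <- ss
          # Bindings in base are locked: unlock, overwrite, then lock again
          (unlockBinding)(x, baseenv())
          assign(x, tt, envir = asNamespace("base"), inherits = FALSE)
          lockBinding(x, baseenv())
        }
      }
    })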
    

    Now, if you check the first line of each function in our list:

    unique(lapply(na.rm.f,function(x) body(get(x))[[2]]))
    [[1]]
    na.rm = TRUE
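
    To see what this body rewrite does without touching base, the same idea can be tried on a throwaway function first. This is only a minimal sketch: my_range and patched are hypothetical names introduced for illustration, and the statement is prepended with append() rather than with the indexing trick used above.

    ```r
    # Hypothetical toy function standing in for a base function with an na.rm argument
    my_range <- function(x, na.rm = FALSE) {
      max(x, na.rm = na.rm) - min(x, na.rm = na.rm)
    }

    # Work on a copy so the original stays available for comparison
    patched <- my_range
    b <- body(patched)

    # Prepend the statement `na.rm <- TRUE` right after the opening brace
    body(patched) <- as.call(append(as.list(b), list(quote(na.rm <- TRUE)), after = 1))

    r_original <- my_range(c(1, NA, 5))                # NA
    r_patched  <- patched(c(1, NA, 5))                 # 4
    r_override <- patched(c(1, NA, 5), na.rm = FALSE)  # still 4
    ```

    The last call is the reason the workaround is dangerous: once the line is in the body, a caller who explicitly passes na.rm = FALSE is silently ignored.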
    
  • 2020-12-05 14:51

    It is not possible to change na.rm to TRUE globally. (See Hong Ooi's comment under the question.)

    EDIT:

    Unfortunately, the answer you don't want is the only one that works generally. There's no global option for this like there is for na.action, which only affects modeling functions like lm, glm, etc (and even there, it isn't guaranteed to work in all cases). – Hong Ooi Jul 2 '13 at 6:23
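
    The difference the comment describes can be checked with a small example. This is a rough sketch with made-up data (the data frame d is an assumption for illustration); it sets the na.action option explicitly so the result does not depend on the session defaults.

    ```r
    # Set the modeling option explicitly and remember the previous value
    old <- options(na.action = "na.omit")

    # Made-up data with one missing predictor value
    d <- data.frame(x = c(1, 2, 3, 4, NA), y = c(2, 4, 6, 8, 10))

    # lm honours the na.action option: the incomplete row is dropped
    fit <- lm(y ~ x, data = d)
    n_used <- nobs(fit)   # 4 of the 5 rows are used

    # mean has no such global switch: the NA still propagates
    m <- mean(d$x)        # NA

    # Restore the previous option value
    options(old)
    ```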

  • 2020-12-05 14:54

    For my R package, I overwrote the existing functions mean and sum. Thanks to the great Ben (comments below), I altered my functions to this:

    mean <- function(x, ..., na.rm = TRUE) {
      base::mean(x, ..., na.rm = na.rm)
    }
    

    After this, mean(c(2, NA, 3)) = 2.5 instead of NA.

    And for sum:

    sum <- function(x, ..., na.rm = TRUE) {
      base::sum(x, ..., na.rm = na.rm)
    }
    

    This will yield sum(c(2, NA, 3)) = 5 instead of NA.

    sum(c(2, NA, 3, NaN)) also works.
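
    Because na.rm stays an ordinary argument, the wrapper only changes the default. A short sketch of overriding it per call and of removing the mask again, assuming the wrapper is defined at the top level of a session rather than inside a package:

    ```r
    # Same wrapper as above: only the default of na.rm changes
    mean <- function(x, ..., na.rm = TRUE) {
      base::mean(x, ..., na.rm = na.rm)
    }

    with_default  <- mean(c(2, NA, 3))                 # 2.5
    with_override <- mean(c(2, NA, 3), na.rm = FALSE)  # NA, the caller can still opt out

    # Removing the wrapper makes the name resolve to base::mean again
    rm(mean)
    after_rm <- mean(c(2, NA, 3))                      # NA
    ```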

  • 2020-12-05 14:55

    There are already several answers about changing the na.rm argument globally. I just want to point out the partial() function from the purrr or pryr packages. With it, you can create a copy of an existing function with predefined arguments:

    library(purrr)
    .mean <- partial(mean, na.rm = TRUE)
    
    # Create sample vector
    df <- c(1, 2, 3, 4, NA, 6, 7)
    
    mean(df)
    >[1] NA
    
    .mean(df)
    >[1] 3.833333
    

    We can combine this tip with @agstudy's answer and create copies of all functions that have an na.rm argument, with na.rm = TRUE preset:

    library(purrr)
    
    # Create a vector of function names https://stackoverflow.com/a/17423072/9300556
    Funs <- Filter(is.function,sapply(ls(baseenv()),get,baseenv()))
    na.rm.f <- names(Filter(function(x) any(names(formals(args(x)))%in% 'na.rm'),Funs))
    
    # Create strings. Dot "." is optional
    fs <- lapply(na.rm.f,
                 function(x) paste0(".", x, "=partial(", x ,", na.rm = T)"))
    
    eval(parse(text = fs)) 
    

    So now, there are .all, .min, .max, etc. in our .GlobalEnv. You can run them:

    .min(df)
    > [1] 1
    .max(df)
    > [1] 7
    .all(df)
    > [1] TRUE
    

    To overwrite the original functions, just remove the dot "." from the lapply call. Inspired by this blog post.
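
    If purrr is not available, or if building code as strings for eval(parse()) feels fragile, a similar set of copies can be sketched in base R with closures. The names make_na_rm, fun_names and na_rm_env are hypothetical and introduced only for this example, and only a few functions from the list are used; this is a different technique from partial(), not a drop-in replacement for it.

    ```r
    # A few base functions that accept na.rm (a subset of the list above)
    fun_names <- c("min", "max", "sum", "mean")

    # Build a wrapper that always passes na.rm = TRUE; force() fixes f for the closure
    make_na_rm <- function(f) {
      force(f)
      function(...) f(..., na.rm = TRUE)
    }

    wrappers <- lapply(fun_names, function(nm) make_na_rm(get(nm, envir = baseenv())))
    names(wrappers) <- paste0(".", fun_names)

    # Keep the copies in their own environment instead of .GlobalEnv
    na_rm_env <- list2env(wrappers, envir = new.env())

    x <- c(1, 2, 3, 4, NA, 6, 7)
    lo    <- na_rm_env$.min(x)   # 1
    hi    <- na_rm_env$.max(x)   # 7
    total <- na_rm_env$.sum(x)   # 23
    ```

    Keeping the wrappers in a separate environment avoids masking anything by accident; they can still be exposed on the search path later with attach() if that is really wanted.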
