When to use missing versus NULL values for passing undefined function arguments in R, and why?

梦谈多话 2021-01-01 23:07

To date, when writing R functions, I've passed undefined arguments as NULL values and then tested whether they are NULL, i.e.

f1 <- function (x = NULL) {
           


        
3 answers
  • 2021-01-01 23:31

    In my opinion, it is not clear when the limitation on missing applies. The documentation, as you quote, says that missing can only be used in the immediate body of the function. A simple example, though, shows that missing still works as expected when the arguments are passed on to a nested function call.

    f1 = function(x, y, z){
      if(!missing(x))
        print(x)
      if(!missing(y))
        print(y)
    }
    
    f2 = function(x, y, z){
      if(!missing(z)) print(z)
      f1(x, y)
    }
    f1(y="2")
    #> [1] "2"
    f2(y="2", z="3")
    #> [1] "3"
    #> [1] "2"
    f2(x="1", z="3")
    #> [1] "3"
    #> [1] "1"
    

    I would like to see an example of a case when missing does not work in a nested function.

    Created on 2019-09-30 by the reprex package (v0.2.1)
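
    One possible reading of the documented restriction is that it concerns a function defined inside the body, rather than a function the argument is forwarded to. The sketch below illustrates that reading only; the names outer_fn and inner_fn are hypothetical and do not come from the question or the documentation.

    ```r
    # Hypothetical names: x is an argument of outer_fn only.
    outer_fn <- function(x) {
      # inner_fn is defined inside the body and has no argument called x,
      # so missing(x) is asked about something that is not its own argument.
      inner_fn <- function() missing(x)
      # Capture the error message instead of stopping the script.
      tryCatch(inner_fn(), error = function(e) conditionMessage(e))
    }

    result <- outer_fn()
    print(result)  # an error message rather than TRUE or FALSE
    ```

    Here missing is not called in the immediate body of the function that defines x, which is the situation the documentation appears to describe.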

  • 2021-01-01 23:37

    NULL is just another value you can assign to a variable. It's no different than any other default value you'd assign in your function's declaration.

    missing, on the other hand, checks whether the user supplied that argument. You can run this check before the default is assigned, because with R's lazy evaluation the default is only evaluated when the variable is first used.
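
    A minimal sketch of that ordering, assuming a hypothetical function lazy_default whose default would raise an error if it were ever evaluated:

    ```r
    # Hypothetical function: the default is deliberately an error.
    lazy_default <- function(x = stop("default was evaluated")) {
      if (missing(x)) {
        # missing() does not force the default, so no error is raised here.
        "x was not supplied"
      } else {
        x
      }
    }

    res_missing <- lazy_default()
    res_given <- lazy_default(42)
    print(res_missing)
    print(res_given)
    ```

    The call without arguments returns normally because the default expression is never evaluated on that path.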

    Two things you can achieve with this: arguments with no default value that can still be omitted (e.g. file and text in read.table), and arguments with default values of which only one may be specified (e.g. n and nmax in scan).
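
    The file/text pattern could look roughly like the sketch below. The helper read_lines_from is hypothetical and is not how read.table is implemented; it only mimics the calling convention of two default-less arguments where either one may be omitted.

    ```r
    # Hypothetical helper: neither argument has a default, yet either can be omitted.
    read_lines_from <- function(file, text) {
      if (missing(file) && !missing(text)) {
        # Only text was supplied: split the string into lines.
        strsplit(text, "\n", fixed = TRUE)[[1]]
      } else if (!missing(file)) {
        # A file was supplied: read it from disk.
        readLines(file)
      } else {
        stop("supply either 'file' or 'text'")
      }
    }

    from_text <- read_lines_from(text = "a\nb\nc")
    print(from_text)
    ```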

    You'll find many other use cases by browsing through R code.

  • 2021-01-01 23:47

    missing(x) seems to be a bit faster than giving x a default of NULL and testing it with is.null(x).

    > require('microbenchmark')
    > f1 <- function(x=NULL) is.null(x)
    > f2 <- function(x) missing(x)
    
    > microbenchmark(f1(1), f2(1))
    Unit: nanoseconds
      expr min  lq median    uq  max neval
     f1(1) 615 631  647.5 800.5 3024   100
     f2(1) 497 511  567.0 755.5 7916   100
    
    > microbenchmark(f1(), f2())
    Unit: nanoseconds
     expr min  lq median    uq  max neval
     f1() 589 619    627 745.5 3561   100
     f2() 437 448    463 479.0 2869   100
    

    Note that in the f1 case x is still reported as missing if you make a call f1(), but it has a value that may be read within f1.

    The two tests are not equivalent. missing() just means that the user did not pass any value. is.null() (with a NULL default arg) is TRUE both when the user did not pass anything and when they passed NULL explicitly.
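
    The difference can be seen by reporting both tests from one function. This is a small sketch; the name check_arg is hypothetical.

    ```r
    # Hypothetical function that reports both tests for the same argument.
    check_arg <- function(x = NULL) {
      c(is_missing = missing(x), is_null = is.null(x))
    }

    no_arg <- check_arg()        # nothing passed: missing and NULL
    null_arg <- check_arg(NULL)  # NULL passed explicitly: not missing, but NULL
    value_arg <- check_arg(1)    # a value passed: neither
    print(rbind(no_arg, null_arg, value_arg))
    ```

    Only the first call is reported as missing, while the first two calls both satisfy is.null().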

    By the way, plot.default() and chisq.test() use NULL for their second arguments. On the other hand, getS3method('t.test', 'default') uses NULL for the y argument and missing() for mu (in order to be prepared for many usage scenarios).

    I think that some R users will prefer f1-type functions, especially when working with the *apply family:

    sapply(list(1, NULL, 2, NULL), f1)
    

    Achieving that in the f2 case is not so straightforward.
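
    One possible workaround, sketched here as an assumption rather than a recommended idiom, is to wrap f2 so that a NULL element is turned into a call with no argument:

    ```r
    f2 <- function(x) missing(x)
    inputs <- list(1, NULL, 2, NULL)

    # Passing f2 directly: sapply always supplies an argument, so nothing is missing.
    direct <- sapply(inputs, f2)

    # Wrapper that omits the argument whenever the element is NULL.
    wrapped <- sapply(inputs, function(a) if (is.null(a)) f2() else f2(a))

    print(direct)
    print(wrapped)
    ```

    The wrapper has to do the is.null() test itself, which is the extra step that the f1-type function avoids.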
