Why is my recursive function so slow in R?

前端 未结 7 1744
栀梦
栀梦 2020-12-02 23:51

The following takes about 30 seconds to run whereas I would expect it to be nearly instant. Is there a problem with my code?

x <- fibonacci(35);

fibonac         


        
相关标签:
7条回答
  • 2020-12-03 00:01

    That just provided a nice opportunity to plug Rcpp which allows us to add C++ functions easily to R.

    So after fixing your code slightly, and using the packages inline (to easily compile, load and link short code snippets as dynamically loadable functions) as well as rbenchmark to time and compare functions, we end up with a stunning 700-fold increase in performance:

    R> print(res)
            test replications elapsed relative user.self sys.self
    2 fibRcpp(N)            1   0.092    1.000      0.10        0
    1    fibR(N)            1  65.693  714.054     65.66        0
    R> 
    

    Here we see elapsed times of 92 milliseonds versus 65 seconds, for a relative ratio of 714. But by now everybody else told you not to do this directly in R.... The code is below.

    ## inline to compile, load and link the C++ code
    require(inline)
    
    ## we need a pure C/C++ function as the generated function
    ## will have a random identifier at the C++ level preventing
    ## us from direct recursive calls
    incltxt <- '
    int fibonacci(const int x) {
       if (x == 0) return(0);
       if (x == 1) return(1);
       return (fibonacci(x - 1)) + fibonacci(x - 2);
    }'
    
    ## now use the snipped above as well as one argument conversion
    ## in as well as out to provide Fibonacci numbers via C++
    fibRcpp <- cxxfunction(signature(xs="int"),
                       plugin="Rcpp",
                       incl=incltxt,
                       body='
       int x = Rcpp::as<int>(xs);
       return Rcpp::wrap( fibonacci(x) );
    ')
    
    ## for comparison, the original (but repaired with 0/1 offsets)
    fibR <- function(seq) {
        if (seq == 0) return(0);
        if (seq == 1) return(1);
        return (fibR(seq - 1) + fibR(seq - 2));
    }
    
    ## load rbenchmark to compare
    library(rbenchmark)
    
    N <- 35     ## same parameter as original post
    res <- benchmark(fibR(N),
                     fibRcpp(N),
                     columns=c("test", "replications", "elapsed",
                               "relative", "user.self", "sys.self"),
                     order="relative",
                     replications=1)
    print(res)  ## show result
    

    And for completeness, the functions also produce the correct output:

    R> sapply(1:10, fibR)
     [1]  1  1  2  3  5  8 13 21 34 55
    R> sapply(1:10, fibRcpp)
     [1]  1  1  2  3  5  8 13 21 34 55
    R> 
    
    0 讨论(0)
  • 2020-12-03 00:08

    A recursive implementation with linear cost:

    fib3 <- function(n){
      fib <- function(n, fibm1, fibm2){
        if(n==1){return(fibm2)}
        if(n==2){return(fibm1)}
        if(n >2){
          fib(n-1, fibm1+fibm2, fibm1)  
        }
      }
    fib(n, 1, 0)  
    }
    

    Comparing with the recursive solution with exponential cost:

    > system.time(fibonacci(35))
      usuário   sistema decorrido 
       14.629     0.017    14.644 
    > system.time(fib3(35))
      usuário   sistema decorrido 
        0.001     0.000     0.000
    

    This solution can be vectorized with ifelse:

    fib4 <- function(n){
        fib <- function(n, fibm1, fibm2){
            ifelse(n<=1, fibm2,
              ifelse(n==2, fibm1,
                Recall(n-1, fibm1+fibm2, fibm1)  
              ))
        }
        fib(n, 1, 0)  
    }
    
    fib4(1:30)
    ##  [1]      0      1      1      2      3      5      8
    ##  [8]     13     21     34     55     89    144    233
    ## [15]    377    610    987   1597   2584   4181   6765
    ## [22]  10946  17711  28657  46368  75025 121393 196418
    ## [29] 317811 514229
    

    The only changes required are changing == to <= for the n==1 case, and changing each if block to the equivalent ifelse.

    0 讨论(0)
  • 2020-12-03 00:10

    :-) because you use exponential algorithm!!! So for fibonacci number N it has to call the function 2^N times, which 2^35, which is heck of a number.... :-)

    Use linear algorithm:

    fib = function (x)
    {
            if (x == 0)
                    return (0)
            n1 = 0
            n2 = 1
            for (i in 1:(x-1)) {
                    sum = n1 + n2
                    n1 = n2
                    n2 = sum
            }
            n2
    }
    

    Sorry, edit: the complexity of the exponential recursive algorithm is not O(2^N) but O(fib(N)), as Martinho Fernandes greatly joked :-) Really a good note :-)

    0 讨论(0)
  • 2020-12-03 00:11

    Patrick Burns gives an example in R Inferno of one way to do memoization in R with local() and <<-. In fact, it's a fibonacci:

    fibonacci <- local({
        memo <- c(1, 1, rep(NA, 100))
        f <- function(x) {
            if(x == 0) return(0)
            if(x < 0) return(NA)
            if(x > length(memo))
            stop("’x’ too big for implementation")
            if(!is.na(memo[x])) return(memo[x])
            ans <- f(x-2) + f(x-1)
            memo[x] <<- ans
            ans
        }
    })
    
    0 讨论(0)
  • 2020-12-03 00:16

    Because you are using one of the worst algorithms in the world!

    Complexity of which is O(fibonacci(n)) = O((golden ratio)^n) and golden ratio is 1.6180339887498948482…

    0 讨论(0)
  • 2020-12-03 00:22

    Because the memoise package was already mentioned here is a reference implementation:

    fib <- function(n) {
      if (n < 2) return(1)
      fib(n - 2) + fib(n - 1)
    }
    system.time(fib(35))
    ##    user  system elapsed 
    ##   36.10    0.02   36.16
    
    library(memoise)
    fib2 <- memoise(function(n) {
      if (n < 2) return(1)
      fib2(n - 2) + fib2(n - 1)
    })
    system.time(fib2(35))
    ##    user  system elapsed 
    ##       0       0       0
    

    Source: Wickham, H.: Advanced R, p. 238.

    In general memoization in computer science means that you save the results of a function so that when you call it again with the same arguments it returns the saved value.

    0 讨论(0)
提交回复
热议问题