Are nested functions efficient?

暖寄归人 2021-02-12 21:48

In programming languages like Scala or Lua, we can define nested functions such as

function factorial(n)
  function _fac(n, acc)
    if n == 0 then
      return …

5 Answers
  • 2021-02-12 22:19

    Yes (or it used to), as evidenced by Lua's effort to reuse function values when execution passes through a function definition multiple times.

    Lua 5.2 Changes

    Equality between function values has changed. Now, a function definition may not create a new value; it may reuse some previous value if there is no observable difference to the new function.

    Since you have coded (assuming Lua) a function assigned to a global or local declared at a higher scope, you could code the short-circuit yourself (presuming no other code sets it to anything other than nil or false):

    function factorial(n)
      _fac = _fac or function (n, acc)
      …
      end
      …
    end
    
  • 2021-02-12 22:19

    I don't know about Lua, but in Scala nested functions are very common. They are often used for recursive helpers, so that the compiler can check that the helper is tail-recursive and optimize it:

    import scala.annotation.tailrec
    
    def factorial(i: Int): Int = {
      // @tailrec makes compilation fail if fact is not tail-recursive
      @tailrec
      def fact(i: Int, accumulator: Int): Int =
        if (i <= 1) accumulator
        else fact(i - 1, i * accumulator)
    
      fact(i, 1)
    }
    

    More info about tail-safe and recursion here

  • 2021-02-12 22:23

    Let's benchmark it in Lua with/without nested functions.

    Variant 1 (inner function object is created on every call)

    local function factorial1(n)
       local function _fac1(n, acc)
          if n == 0 then
             return acc
          else
             return _fac1(n-1, acc * n)
          end
       end
    
       return _fac1(n, 1)
    end
    

    Variant 2 (functions are not nested)

    local function _fac2(n, acc)
       if n == 0 then
          return acc
       else
          return _fac2(n-1, acc * n)
       end
    end
    
    local function factorial2(n)
       return _fac2(n, 1)
    end
    

    Benchmarking code (compute 12! ten million times and print the CPU time used, in seconds):

    local N = 1e7
    
    local start_time = os.clock()
    for j = 1, N do
       factorial1(12)
    end
    print("CPU time of factorial1 = ", os.clock() - start_time)
    
    start_time = os.clock()  -- reuse the local declared above
    for j = 1, N do
       factorial2(12)
    end
    print("CPU time of factorial2 = ", os.clock() - start_time)
    

    Output for Lua 5.3 (interpreter)

    CPU time of factorial1 = 8.237
    CPU time of factorial2 = 6.074
    

    Output for LuaJIT (JIT-compiler)

    CPU time of factorial1 = 1.493
    CPU time of factorial2 = 0.141
    
  • 2021-02-12 22:35

    "Does this approach cause any inefficiency because the nested function instance is defined, or created, every time we invoke the outer function?"

    Efficiency is a large and broad topic. I am assuming that by "inefficient" you mean "does calling the method recursively each time have an overhead"?

    I can only speak on Scala's behalf, specifically the flavor targeting the JVM, as other flavors may act differently.

    We can divide this question into two, depending on what you really meant.

    Nested (local scope) methods in Scala are a lexical scoping feature: they give you access to the values of the outer method, but in the emitted bytecode they are defined at the class level, just like a plain Java method.

    For completeness, note that Scala also has function values, which are first-class citizens, meaning you can pass them around to other functions. These can incur an allocation overhead, since they are implemented as class instances.
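
    To make the distinction concrete, here is a minimal sketch, assuming a recent Scala 2 or 3 compiler on the JVM. The names `NestedVsFunctionValue`, `sumTo`, `go`, and `adder` are hypothetical and only serve the illustration. `go` is a nested method, while `adder` returns a function value that captures `k`:

    ```scala
    object NestedVsFunctionValue {
      // Nested method: as described above, it is expected to end up as a
      // class-level method, so calling sumTo creates no object for `go`.
      def sumTo(n: Int): Int = {
        @scala.annotation.tailrec
        def go(i: Int, acc: Int): Int =
          if (i > n) acc else go(i + 1, acc + i)
        go(1, 0)
      }

      // Function value: an object of type Int => Int that can be stored and
      // passed around. It captures `k`, so a call to adder typically
      // allocates a new function object (assumption about the JVM backend).
      def adder(k: Int): Int => Int = (x: Int) => x + k

      def main(args: Array[String]): Unit = {
        val add3 = adder(3)
        println(sumTo(10))               // sum of 1..10
        println(List(1, 2, 3).map(add3)) // function value passed to map
      }
    }
    ```

    Whether the allocation matters in practice depends on the JIT; escape analysis may remove it entirely, so treat this as an illustration rather than a measurement.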

    Factorial can be written in a tail-recursive manner, as you did in your example. The Scala compiler will notice that the method is tail-recursive and turn it into an iterative loop, avoiding a method invocation for each iteration. It may also, where possible, inline the factorial method, avoiding the overhead of an additional stack frame.

    For example, consider the following factorial implementation in Scala:

    import scala.annotation.tailrec
    
    def factorial(num: Int): Int = {
      @tailrec
      def fact(num: Int, acc: Int): Int = num match {
        case 0 => acc
        case n => fact(n - 1, acc * n)
      }
    
      fact(num, 1)
    }
    

    On the face of it, we have a recursive method. Let's see what the JVM bytecode looks like:

    private final int fact$1(int, int);
      Code:
       0: iload_1
       1: istore        4
       3: iload         4
       5: tableswitch   { // 0 to 0
                     0: 24
               default: 28
          }
      24: iload_2
      25: goto          41
      28: iload         4
      30: iconst_1
      31: isub
      32: iload_2
      33: iload         4
      35: imul
      36: istore_2
      37: istore_1
      38: goto          0
      41: ireturn
    

    What we see here is that the recursion turned into an iterative loop (a tableswitch + a jump instruction).
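
    For readers who prefer source code to bytecode, the loop can be sketched by hand. This is an approximation of what the bytecode does, not the compiler's literal output; the object name `FactorialLoop` and the parameter name `start` are hypothetical:

    ```scala
    object FactorialLoop {
      // Hand-written counterpart of the loop in the bytecode: the two
      // locals are updated in place and control jumps back to the top
      // instead of making a recursive call.
      def fact(num: Int, start: Int): Int = {
        var n = num
        var acc = start
        while (n != 0) {  // the tableswitch on 0 decides whether to exit
          acc = acc * n   // imul, istore_2
          n = n - 1       // isub, istore_1
        }
        acc               // iload_2, ireturn
      }

      def factorial(num: Int): Int = fact(num, 1)
    }
    ```

    Like the recursive version, this sketch does not guard against negative input or `Int` overflow.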

    Regarding method instance creation: if our method were not tail-recursive, the JVM runtime would interpret its bytecode on each invocation until the C2 compiler considers it hot enough to JIT-compile, after which the machine code is reused for every subsequent call.

    Generally, I would say this shouldn't worry you unless the method is on your hot path and profiling the code is what led you to ask this question.

    To conclude, efficiency is a very delicate, use case specific topic. I think we don't have enough information to tell you, from the simplified example you've provided, if this is the best option to choose for your use case. I say again, if this isn't something that showed up on your profiler, don't worry about this.

  • 2021-02-12 22:40

    The answer depends on the language of course.

    What happens in Scala in particular is that inner functions are compiled as if they were defined outside the scope of the function in which they appear.

    This way the language only allows you to invoke them from the lexical scope in which they are defined, but it does not actually instantiate the function multiple times.

    We can easily test this by compiling two variants of the factorial function.

    The first one is a fairly faithful port of your Lua code:

    class Function1 {
    
      def factorial(n: Int): Int = {
        def _fac(n: Int, acc: Int): Int =
          if (n == 0)
            acc
          else
            _fac(n-1, acc * n)
    
        _fac(n, 1)
      }
    
    }
    

    The second one is more or less the same, but the tail recursive function is defined outside of the scope of factorial:

    class Function2 {
    
      def factorial(n: Int): Int = _fac(n, 1)
    
      private final def _fac(n: Int, acc: Int): Int =
        if (n == 0)
          acc
        else
          _fac(n-1, acc * n)
    
    }
    

    We can now compile these two classes with scalac and then use javap to have a look at the compiler output:

    javap -p Function*.class
    

    which will yield the following output:

    Compiled from "Function1.scala"
    public class Function1 {
      public int factorial(int);
      private final int _fac$1(int, int);
      public Function1();
    }
    Compiled from "Function2.scala"
    public class Function2 {
      public int factorial(int);
      private final int _fac(int, int);
      public Function2();
    }
    

    I added the private final keywords to minimize the difference between the two, but the main thing to notice is that in both cases the definitions appear at the class level. Inner functions are automatically defined as private and final, with a small decoration of the name to ensure there is no name clash (e.g. if you define a loop inner function inside two different methods).
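
    The name-clash case can be tried with a small sketch like the following. The class `TwoLoops` and its methods are hypothetical names for illustration. Both methods define an inner function called `loop`, and the class compiles because each `loop` is local to its enclosing method:

    ```scala
    class TwoLoops {
      // First inner `loop`: adds up the numbers n, n-1, ..., 1.
      def sumTo(n: Int): Int = {
        def loop(i: Int, acc: Int): Int =
          if (i == 0) acc else loop(i - 1, acc + i)
        loop(n, 0)
      }

      // Second inner `loop`: same name, different body and enclosing method.
      def productTo(n: Int): Int = {
        def loop(i: Int, acc: Int): Int =
          if (i == 0) acc else loop(i - 1, acc * i)
        loop(n, 1)
      }
    }
    ```

    Running `javap -p TwoLoops.class` on the compiled class should list two private methods whose names carry distinct suffixes. The exact decorated names depend on the compiler version, so check the output rather than rely on a particular suffix.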

    I'm not sure about Lua or other languages, but I would expect most compiled languages to adopt a similar approach.
