Why is my Scala tail-recursion faster than the while loop?

后端 未结 2 1937
广开言路
广开言路 2021-01-31 03:00

Here are two solutions to exercise 4.9 in Cay Horstmann\'s Scala for the Impatient: \"Write a function lteqgt(values: Array[Int], v: Int) that returns a triple containing the co

2条回答
  •  梦毁少年i
    2021-01-31 03:42

    Test results (after reducing array size to 20000000)

    Under Java 1.6.22 I get 151 and 122 ms for tail-recursion and while-loop respectively.

    Under Java 1.7.0 I get 55 and 101 ms

    So under Java 6 your while-loop is actually faster; both have improved in performance under Java 7, but the tail-recursive version has overtaken the loop.

    Explanation

    The performance difference is due to the fact that in your loop, you conditionally add 1 to the totals, while for recursion you always add either 1 or 0. So they are not equivalent. The equivalent while-loop to your recursive method is:

      def lteqgt2(values:Array[Int], v:Int):(Int, Int, Int) = {
        var lt = 0
        var eq = 0
        var gt = 0
        var pos = 0
        val limit = values.length
        while (pos < limit) {
          gt += (if (values(pos) > v) 1 else 0)
          lt += (if (values(pos) < v) 1 else 0)
          eq += (if (values(pos) == v) 1 else 0)
          pos += 1
        }
        (lt, eq, gt)
      }
    

    and this gives exactly the same execution time as the recursive method (regardless of Java version).

    Discussion

    I'm not an expert on why the Java 7 VM (HotSpot) can optimize this better than your first version, but I'd guess it's because it's taking the same path through the code each time (rather than branching along the if / else if paths), so the bytecode can be inlined more efficiently.

    But remember that this is not the case in Java 6. Why one while-loop outperforms the other is a question of JVM internals. Happily for the Scala programmer, the version produced from idiomatic tail-recursion is the faster one in the latest version of the JVM.

    The difference could also be occurring at the processor level. See this question, which explains how code slows down if it contains unpredictable branching.

提交回复
热议问题