Is it reasonable to use Scala's REPL for comparative performance benchmarks?

Question

Scala's REPL is a wonderful playground for interactively testing pieces of code. Recently, I've been doing some performance comparisons using the REPL to repeatedly execute an operation and compare wall-clock times.

Here's an example I recently created to help answer an SO question [1][2]:

// Figure out the performance difference between direct method invocation and reflection-based method.invoke

import java.lang.reflect.Method

def invoke1[T,U](obj:Any, method:Method)(param:T):U = method.invoke(obj,Seq(param.asInstanceOf[java.lang.Object]):_*) match {
    // a type pattern never matches null, so match the null result explicitly
    case null => null.asInstanceOf[U]
    case x => x.asInstanceOf[U]
}

def time[T](b: => T):(T, Long) = {
    val t0 = System.nanoTime()
    val res = b
    val t = System.nanoTime() - t0
    (res,t )
}

class Test {
  def op(l:Long): Long = (2 until math.sqrt(l).toInt).filter(x=>l%x==0).sum
}

val t0 = new Test

val method = classOf[Test].getMethods.find(_.getName=="op").get

def timeDiff = {
  // time returns (result, elapsedNanos), so the elapsed time is the second element
  val (res, timeDirectCall) = time { (0 to 1000000).map(x=>t0.op(x)) }
  val (res2, timeInvoke) = time { (0 to 1000000).map(x=>{val res:Long=invoke1(t0,method)(x);res}) }
  (timeInvoke-timeDirectCall).toDouble/timeDirectCall.toDouble
}


//scala> timeDiff
//res60: Double = 2.1428745665357445
//scala> timeDiff
//res61: Double = 2.1604176409796683

In another case I've been generating millions of random data points to compare concurrency models for an open source project. The REPL has been great for playing with different configurations without a code-compile-test cycle.

I'm aware of common benchmarking pitfalls such as JIT optimizations and the need for warm-up.

My questions are:

Are there any REPL-specific elements to take into account when using it to perform comparative micro or macro benchmarks?
Are these measurements reliable when used relative to each other? i.e. can they answer the question: is A faster than B?
Are preliminary executions of the same code a good warm-up for the JIT compiler?
Any other issues to be aware of?

[1] Scala reflection: How to pass an object's method as parameter to another method

[2] https://gist.github.com/maasg/6808879

Answer 1:

This is a great question. I can't imagine why anyone downvoted it.

The fact that one of the comments is totally wrong suggests that the REPL needs a place on scala-lang.org's faq or tutorial. I can't find the descriptive paper after a quick search.

The answer is yes, the REPL does what you expect.

Here is an old page on why the question is interesting: the REPL feels dynamic but is really statically compiled. It "straddles two worlds," as the extemporaneous comment on the linked page puts it.

The REPL compiles each line into its own wrapping object. Each such object imports symbols from the history of the interactive session, which is how code magically refers back to previous lines. Everything is compiled, so when it runs, it runs natively on the JVM, so to speak; there is no extra interpreter layer. That is the REPL's killer design feature.
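As a rough picture of the wrapping described above, two lines typed at the prompt can be thought of as two objects, the second importing from the first. The names `Line1` and `Line2` are made up for this sketch; the real REPL generates its own synthetic wrapper names and more nesting than shown here.

```scala
// Hypothetical stand-in for the wrapper of the first prompt line:
//   scala> def square(x: Int): Int = x * x
object Line1 {
  def square(x: Int): Int = x * x
}

// Hypothetical stand-in for the wrapper of the second prompt line:
//   scala> square(7)
// The import is what lets this line see names from session history.
object Line2 {
  import Line1._
  val res0: Int = square(7)
}
```

Both objects are ordinary compiled classes, so calling `Line1.square` from a later line costs the same as any other compiled method call.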

That is why the answer to your question is yes, your code runs at the speed of compiled code. Invoking a method does not require recompiling all of history.

Here's another old link showing that other people have had the same question about timing and microbenchmarking.

There is currently an open issue to make it possible to customize how the REPL wraps lines of code. Microbenchmarking is an interesting use case, where code could be wrapped in an arbitrary framework for benchmarking. That will be coming soon.

The benchmark framework should take care of warm-ups. Since each expression submitted to the REPL is compiled separately (albeit by the same compiler), you may notice that a method is invoked cold the first time and warm the second (modulo inlining by scalac).
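When no benchmark framework is at hand, a manual warm-up can be sketched as below. `timeWarm` is a hypothetical helper, not part of the REPL or any library, and the warm-up and run counts in the usage lines are arbitrary assumptions. It runs the block untimed a few times, then reports the median of several timed runs.

```scala
// Hypothetical helper: untimed warm-up runs followed by timed runs.
// Returns the last result and the median elapsed time in nanoseconds.
def timeWarm[T](warmups: Int, runs: Int)(block: => T): (T, Long) = {
  require(runs > 0, "need at least one timed run")
  var i = 0
  while (i < warmups) { block; i += 1 } // results are discarded
  var last: Option[T] = None
  val samples: Array[Long] = Array.fill(runs) {
    val t0 = System.nanoTime()
    val r = block
    val elapsed = System.nanoTime() - t0
    last = Some(r)
    elapsed
  }
  java.util.Arrays.sort(samples)
  (last.get, samples(runs / 2))
}

// Usage: compare two variants with the same warm-up policy.
val (_, nanosA) = timeWarm(warmups = 5, runs = 11) { (0 to 100000).map(_ * 2L).sum }
val (_, nanosB) = timeWarm(warmups = 5, runs = 11) { (0 to 100000).foldLeft(0L)(_ + _ * 2L) }
```

A median is used rather than a mean so that a single run disturbed by garbage collection or a late JIT compilation has less influence on the comparison.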

Caveat:

Use -Yrepl-class-based or be careful not to put computations in the static initializer of the wrapping object.
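To illustrate the caveat: in the default object-based mode, the right-hand side of a `val` typed at the prompt becomes part of the wrapper object's initializer. The sketch below contrasts that with a class-style wrapper and with keeping the work in a method defined on an earlier line. All three names are hypothetical stand-ins, not what the REPL actually generates.

```scala
// Hypothetical object-style wrapper: the computation runs while the
// object is being initialized.
object ObjectWrapper {
  val total: Long = (1L to 1000000L).sum
}

// Hypothetical class-style wrapper: the same computation runs in an
// ordinary constructor instead.
class ClassWrapper {
  val total: Long = (1L to 1000000L).sum
}

// Alternative: keep the work in a method defined on an earlier line, so
// the hot loop lives in an object that is already initialized when called.
object Workload {
  def total(): Long = (1L to 1000000L).sum
}
```

The `timeDiff` method in the question follows the third pattern: the loops being timed live inside a `def`, and only the call is made from the prompt.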

Here is some sample confusion and here is the same question, less concealed.

Source: https://stackoverflow.com/questions/19234717/is-it-reasonable-to-use-scalas-repl-for-comparative-performance-benchmarks

Tags

scala

performance-testing

read-eval-print-loop