Create a quick/reliable benchmark with Java?

花落未央 2020-11-30 12:18

I'm trying to create a benchmark test with Java. Currently I have the following simple method:

public static long runTest(int times){
    long start = System.nanoTime(); // truncated in the original ("Syste...");
    String str = "";                // the answers below indicate the body
    for (int i = 0; i < times; i++) {   // appended to a String in a loop,
        str += "sometext";              // so it is reconstructed here
    }
    return System.nanoTime() - start;
}
3 Answers
  • 2020-11-30 12:31

    In short:

    (Micro-)benchmarking is very complex, so use a tool like the Benchmarking framework http://www.ellipticgroup.com/misc/projectLibrary.zip - and still be skeptical about the results ("Put micro-trust in a micro-benchmark", Dr. Cliff Click).

    In detail:

    There are a lot of factors that can strongly influence the results:

    • The accuracy and precision of System.nanoTime: in the worst case it is as bad as that of System.currentTimeMillis.
    • code warmup and class loading
    • mixed mode: the JVM's JIT compiler (see Edwin Buck's answer) only compiles a code block after it has been called sufficiently often (1,500 or 10,000 times by default, depending on client/server mode)
    • dynamic optimizations: deoptimization, on-stack replacement, dead code elimination (you should use the result you computed in your loop, e.g. print it)
    • resource reclamation: garbage collection (see Michael Borgwardt's answer) and object finalization
    • caching: I/O and CPU
    • your operating system on the whole: screen saver, power management, other processes (indexer, virus scan, ...)
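The warmup and dead-code points above can be sketched by hand (a minimal illustration only, not the framework's approach; the class name, workload, and iteration counts are all made up for the example):

```java
public class BenchmarkSketch {
    // Accumulate results so the JIT cannot eliminate the loop as dead code.
    private static long sink;

    static long workload(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    static long timeOnce(int n) {
        long start = System.nanoTime();
        sink += workload(n);            // use the result: defeats dead-code elimination
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Warmup: call often enough that the JIT compiles workload().
        for (int i = 0; i < 20_000; i++) {
            timeOnce(1_000);
        }
        // Measured runs, after warmup.
        long total = 0;
        int runs = 50;
        for (int i = 0; i < runs; i++) {
            total += timeOnce(1_000);
        }
        System.out.println("mean ns/run: " + (total / runs) + " (sink=" + sink + ")");
    }
}
```

Even with warmup and a consumed result, this hand-rolled approach still ignores most of the factors listed above, which is exactly why a dedicated framework is recommended.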

    Brent Boyer's article "Robust Java benchmarking, Part 1: Issues" ( http://www.ibm.com/developerworks/java/library/j-benchmark1/index.html) is a good description of all those issues and whether/what you can do against them (e.g. use JVM options or call ProcessIdleTask beforehand).

    You won't be able to eliminate all these factors, so doing statistics is a good idea. But:

    • instead of computing the difference between the max and min, you should put in the effort to compute the standard deviation (the result set {1, 1000 times 2, 3} is different from {501 times 1, 501 times 3}, even though both have the same min, max, and mean).
    • The reliability is taken into account by producing confidence intervals (e.g. via bootstrapping).
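To see why the standard deviation matters, here is a small sketch of the two distributions from the bullet above (a standalone illustration; the stddev helper is hypothetical and not part of the framework):

```java
import java.util.Arrays;

public class StddevDemo {
    // Population standard deviation of a sample.
    static double stddev(double[] xs) {
        double mean = Arrays.stream(xs).average().orElse(0);
        double var = Arrays.stream(xs).map(x -> (x - mean) * (x - mean)).average().orElse(0);
        return Math.sqrt(var);
    }

    public static void main(String[] args) {
        // {1, 1000 times 2, 3}: nearly all mass sits at the mean.
        double[] a = new double[1002];
        Arrays.fill(a, 2.0);
        a[0] = 1.0;
        a[1001] = 3.0;

        // {501 times 1, 501 times 3}: all mass sits at the extremes.
        double[] b = new double[1002];
        Arrays.fill(b, 0, 501, 1.0);
        Arrays.fill(b, 501, 1002, 3.0);

        // Both have min 1, max 3, mean 2.0 -- but b's spread is far larger
        // (stddev of b is exactly 1.0; a's is close to zero).
        System.out.printf("stddev a = %.4f, stddev b = %.4f%n", stddev(a), stddev(b));
    }
}
```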

    The above mentioned Benchmark framework ( http://www.ellipticgroup.com/misc/projectLibrary.zip) uses these techniques. You can read about them in Brent Boyer's article "Robust Java benchmarking, Part 2: Statistics and solutions" ( https://www.ibm.com/developerworks/java/library/j-benchmark2/).

  • 2020-11-30 12:36

    In the 10 million times run, odds are good the HotSpot compiler detected a "heavily used" piece of code and compiled it into native machine code.

    JVM bytecode is interpreted, which leaves it susceptible to more interruptions from other background processes running in the JVM (like garbage collection).

    Generally speaking, these kinds of benchmarks are rife with assumptions that don't hold. You cannot trust that a micro-benchmark really proves what it set out to prove without substantial evidence that the measurement (time) covers only your task and not other background tasks as well. If you don't attempt to control for background tasks, the measurement is much less useful.
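A quick way to observe the HotSpot effect described above is to time the same loop twice in one process (a hypothetical standalone demo; absolute timings will vary widely by JVM, machine, and background load, so no particular numbers should be expected):

```java
public class JitWarmupDemo {
    static long hotLoop(int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) {
            acc += i;
        }
        return acc;
    }

    static long timeIt(int n) {
        long start = System.nanoTime();
        hotLoop(n);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long first = timeIt(10_000_000);   // mostly interpreted / still compiling
        long second = timeIt(10_000_000);  // typically JIT-compiled by now
        System.out.println("first run:  " + first + " ns");
        System.out.println("second run: " + second + " ns");
    }
}
```

On most JVMs the second run is noticeably faster, which is exactly the warmup effect that naive benchmarks fail to account for.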

  • 2020-11-30 12:37

    Your code ends up testing mainly garbage collection performance because appending to a String in a loop ends up creating and immediately discarding a large number of increasingly large String objects.

    This is something that inherently leads to wildly varying measurements and is influenced strongly by multi-threaded activity.

    I suggest you do something else in your loop that has more predictable performance, like mathematical calculations.
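Following that suggestion, a version of runTest with a numeric workload might look like this (a hypothetical sketch; Math.sqrt is just one arbitrary choice of allocation-free calculation):

```java
public class PredictableLoop {
    static double runTest(int times) {
        double acc = 0;
        for (int i = 1; i <= times; i++) {
            acc += Math.sqrt(i);   // pure arithmetic: no allocation, no GC pressure
        }
        return acc;                // return the result so the caller can use (print) it
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        double result = runTest(10_000_000);
        long elapsed = System.nanoTime() - start;
        // Printing the result also prevents the loop from being optimized away.
        System.out.println("result=" + result + " in " + elapsed + " ns");
    }
}
```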
