When a busy-spinning Java thread is bound to a physical core, can a context switch happen because a new branch in the code is reached?

北恋 2021-02-07 06:01

I am interested in low-latency code, and that's why I tried to configure thread affinity. In particular, it was supposed to help avoid context switches.

I have configur

1 answer
  •  北海茫月
    2021-02-07 06:51

    A voluntary context switch usually means a thread is waiting for something, e.g. for a lock to become free.
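
As a rough cross-check that is independent of any profiler, a thread can read its own switch counters on Linux. The sketch below is not part of the method used in this answer; it assumes Linux, Java 11+, and the `voluntary_ctxt_switches` / `nonvoluntary_ctxt_switches` fields of `/proc/thread-self/status`. The class name `CtxSwitchCounter` and its helpers are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class CtxSwitchCounter {
    /** Extracts a numeric field such as "voluntary_ctxt_switches" from /proc status lines. */
    static long parseField(List<String> statusLines, String field) {
        String prefix = field + ":";
        for (String line : statusLines) {
            if (line.startsWith(prefix)) {
                return Long.parseLong(line.substring(prefix.length()).trim());
            }
        }
        return -1; // field not present in the given lines
    }

    /** Reads a counter for the calling thread (assumption: Linux with /proc/thread-self). */
    static long currentThread(String field) throws IOException {
        return parseField(Files.readAllLines(Path.of("/proc/thread-self/status")), field);
    }

    public static void main(String[] args) throws Exception {
        long volBefore = currentThread("voluntary_ctxt_switches");
        long involBefore = currentThread("nonvoluntary_ctxt_switches");

        Thread.sleep(10); // blocking call: the thread gives up the CPU while it waits

        long volDelta = currentThread("voluntary_ctxt_switches") - volBefore;
        long involDelta = currentThread("nonvoluntary_ctxt_switches") - involBefore;
        System.out.println("voluntary delta:    " + volDelta);
        System.out.println("nonvoluntary delta: " + involDelta);
    }
}
```

Sampling these counters before and after the busy-spin section tells you how many switches happened, but not where; that is what the profiler is for.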

    async-profiler can help to find where context switches happen. Here is a command line I used:

    ./profiler.sh -d 80 -e context-switches -i 2 -t -f switches.svg -I 'main*' -X 'exit_to_usermode_loop*' PID
    

    Let's go through it in detail:

    • -d 80: run the profiler for at most 80 seconds.
    • -e context-switches: the event to profile.
    • -i 2: interval = 2 events. I profile every second context switch, since the profiling signal itself causes a context switch, and I don't want to fall into recursion.
    • -t: split the profile by threads.
    • -f switches.svg: output file name; the svg extension automatically selects the Flame Graph format.
    • -I 'main*': include only the main thread in the output.
    • -X 'exit_to_usermode_loop*': exclude events related to nonvoluntary context switches.
    • PID: the Java process ID to profile.

    The results may differ from one run to another. Typically I see from 0 to 3 context switches on each graph.

    Here are the most common places where a context switch happens. They are indeed related to waiting on a mutex.

    1. ThreadSafepointState::handle_polling_page_exception() called from TestLoop.main. This means the thread was stopped at a safepoint requested by another thread. To investigate the reason for a safepoint, add the -Xlog:safepoint* JVM option.
    [75.889s][info][safepoint        ] Application time: 74.0071000 seconds
    [75.889s][info][safepoint        ] Entering safepoint region: Cleanup
    [75.889s][info][safepoint,cleanup] deflating idle monitors, 0.0000003 secs
    [75.889s][info][safepoint,cleanup] updating inline caches, 0.0000058 secs
    [75.890s][info][safepoint,cleanup] compilation policy safepoint handler, 0.0000004 secs
    [75.890s][info][safepoint,cleanup] purging class loader data graph, 0.0000001 secs
    [75.890s][info][safepoint,cleanup] resizing system dictionaries, 0.0000009 secs
    [75.890s][info][safepoint,cleanup] safepoint cleanup tasks, 0.0001440 secs
    [75.890s][info][safepoint        ] Leaving safepoint region
    

    Right, a Cleanup safepoint happens shortly after 74 seconds (exactly the specified delay). The purpose of a Cleanup safepoint is to run periodic tasks; in this case, to update inline caches. If there is cleanup work to do, a safepoint may happen every GuaranteedSafepointInterval milliseconds (1000 by default). You can disable periodic safepoints by setting -XX:GuaranteedSafepointInterval=0, but this may have performance implications.

    2. SharedRuntime::handle_wrong_method() from TimeUtils.now. This happens when a call site in the compiled code has been made non-entrant. As this is related to JIT compilation, add the -XX:+PrintCompilation option.
      75032 1430 %     4       main.TestLoop::main @ 149 (245 bytes)   made not entrant
      75033 1433 %     3       main.TestLoop::main @ 149 (245 bytes)
      75033 1434       4       util.RealtimeNanoClock::nanoTime (8 bytes)
      75034 1431       3       util.RealtimeNanoClock::nanoTime (8 bytes)   made not entrant
      75039 1435 %     4       main.TestLoop::main @ 149 (245 bytes)
      75043 1433 %     3       main.TestLoop::main @ 149 (245 bytes)   made not entrant
    

    Yes, both TestLoop.main and RealtimeNanoClock.nanoTime were recompiled 75 seconds after JVM start. To find out the reason, add -XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation.

    This will produce a large compilation log, in which we look for an event that happened at the 75th second.

    
    
    

    That was an uncommon trap due to unstable_if at bytecode index 161. In other words, when main was JIT compiled, HotSpot did not produce code for the else branch, because it had never been executed before (a form of speculative dead code elimination). However, to retain the correctness of the compiled code, HotSpot places a trap that deoptimizes and falls back to the interpreter if the speculative condition fails. This is exactly what happens in your case when the if condition becomes false.
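
The pattern described above can be sketched as follows. This is a hypothetical reproduction, not the code from the question: the class `UnstableIfDemo`, the iteration counts, and the flip point are all assumptions, and whether HotSpot actually speculates on this branch depends on the JVM version and flags.

```java
class UnstableIfDemo {
    /**
     * Runs a hot loop whose condition stays true until iteration flipAt.
     * Returns {iterations on the hot side, iterations on the late side}.
     */
    static long[] run(int iterations, int flipAt) {
        long hot = 0;
        long late = 0;
        for (int i = 0; i < iterations; i++) {
            if (i < flipAt) {
                hot++;   // the only side the profile has seen when the loop gets compiled
            } else {
                late++;  // reached only near the end; compiled code may hold a trap here
            }
        }
        return new long[] {hot, late};
    }

    public static void main(String[] args) {
        // Assumption: 200 million iterations is long enough for the loop to be
        // compiled well before the condition flips for the last ten iterations.
        int iterations = 200_000_000;
        long[] counts = run(iterations, iterations - 10);
        System.out.println("hot side: " + counts[0] + ", late side: " + counts[1]);
    }
}
```

Running it with -XX:+PrintCompilation and watching for a "made not entrant" line near the end of the run is one way to check whether the speculation and the subsequent deoptimization took place.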

    3. Runtime1::counter_overflow(). This is again related to recompilation. After running C1 compiled code for some time, HotSpot discovers that the code is hot, and decides to recompile it with C2.

      In this case I caught a contended lock on the compiler queue.

    Conclusion

    HotSpot JIT compilers rely heavily on speculative optimizations. When a speculative condition fails, the result is deoptimization. Deoptimization is indeed very bad for low-latency applications: besides switching to slow execution in the interpreter, it may indirectly cause undesired pauses by acquiring locks in the JVM runtime or by bringing the JVM to a safepoint.

    Common reasons for deoptimization are unstable_if and class_check. If you want to avoid deoptimization on a latency-critical path, make sure to warm up all code paths and all possible receivers for virtual methods.
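
A minimal warm-up sketch along these lines is shown below. Everything in it is illustrative: the `Handler` interface, its two implementations, and the round count are assumptions, and the number of rounds needed to reach C2 depends on the JVM's compilation thresholds.

```java
class WarmUpDemo {
    interface Handler {
        long handle(long value);
    }

    static final class Doubler implements Handler {
        public long handle(long value) { return value * 2; }
    }

    static final class Negator implements Handler {
        public long handle(long value) { return -value; }
    }

    /** Latency-critical method: one branch and one virtual call. */
    static long process(Handler handler, long value, boolean rare) {
        if (rare) {
            return handler.handle(value) + 1; // rarely taken in normal operation
        }
        return handler.handle(value);
    }

    /**
     * Calls process() with every branch outcome and every receiver type, so the
     * profile the JIT sees before compiling already covers all of them.
     */
    static long warmUp(int rounds) {
        Handler[] handlers = { new Doubler(), new Negator() };
        long sink = 0; // accumulated and returned so the calls are not dead code
        for (int i = 0; i < rounds; i++) {
            for (Handler h : handlers) {
                sink += process(h, i, false);
                sink += process(h, i, true);
            }
        }
        return sink;
    }

    public static void main(String[] args) {
        // Assumption: 200_000 rounds is enough to trigger top-tier compilation.
        System.out.println("warm-up checksum: " + warmUp(200_000));
        // The latency-critical loop would start here.
    }
}
```

The point of the design is that both sides of the if and both Handler implementations appear in the profile before the critical loop starts, which removes the two triggers named above for this method. It does not rule out deoptimization for other reasons.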
