PMU for multi-threaded environment

囚心锁ツ 2021-01-28 04:35

I am planning to measure PMU counters for L1, L2, and L3 cache misses and branch mispredictions. I have read the related Intel documents, but I am unsure about the scenarios below. Could someone

2 Answers
  • 2021-01-28 04:55

    Summary of the Intel forum thread started by the OP:

    • The Linux perf subsystem virtualizes the performance counters, but this means you have to read them with a system call, instead of rdpmc, to get the full virtualized 64-bit value instead of whatever is currently in the architectural performance counter register.

    • If you want to use rdpmc inside your own code so it can measure itself, pin each thread to a core because context switches don't save/restore PMCs. There's no easy way to avoid measuring everything that happens on the core, including interrupt handlers and other processes that get a timeslice. This can be a good thing, since you need to take the impact of kernel overhead into account.
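
The pinning step can be sketched with the Linux affinity calls. This is a minimal sketch, assuming glibc on Linux; `first_allowed_cpu` and `pin_current_thread` are hypothetical helper names, not part of any library.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Hypothetical helper: lowest-numbered CPU the calling thread may run on,
   or -1 if the affinity mask cannot be read. */
int first_allowed_cpu(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Hypothetical helper: restrict the calling thread to a single CPU.
   Returns 0 on success, -1 on error (errno is set by the kernel). */
int pin_current_thread(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* pid 0 means "the calling thread" */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Each worker thread would call `pin_current_thread()` with its own CPU number before touching the counters. Pinning only stops the thread from migrating; other work scheduled on that core still shows up in the counts.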


    More useful quotes from John D. McCalpin, PhD ("Dr. Bandwidth"):

    For inline code instrumentation you should be able to use the "perf events" API, but the documentation is minimal. Some resources are available at http://web.eece.maine.edu/~vweaver/projects/perf_events/faq.html
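
As an illustration of that API, here is a minimal sketch that counts one event for the calling thread and reads it back with a system call. The helper names (`perf_open_self`, `perf_start`, `perf_stop_read`) are hypothetical; the attribute choices (user-space only, counter initially disabled) are assumptions for the example, not something the quoted post prescribes.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Open a counter that follows the calling thread across CPUs.
   Returns a file descriptor, or -1 with errno set (for example when
   the PMU is not exposed or perf is restricted on this system). */
int perf_open_self(uint32_t type, uint64_t config)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = type;
    attr.config = config;
    attr.disabled = 1;        /* start stopped; enabled in perf_start() */
    attr.exclude_kernel = 1;  /* assumption: count user-space only */
    attr.exclude_hv = 1;
    /* pid = 0: calling thread, cpu = -1: any CPU, group_fd = -1, flags = 0 */
    return (int)syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Zero the counter and start counting. Returns 0 on success. */
int perf_start(int fd)
{
    if (ioctl(fd, PERF_EVENT_IOC_RESET, 0) != 0)
        return -1;
    return ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
}

/* Stop counting and fetch the 64-bit virtualized value. Returns 0 on success. */
int perf_stop_read(int fd, uint64_t *count)
{
    if (ioctl(fd, PERF_EVENT_IOC_DISABLE, 0) != 0)
        return -1;
    return read(fd, count, sizeof(*count)) == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

A caller might open `perf_open_self(PERF_TYPE_HARDWARE, PERF_COUNT_HW_BRANCH_MISSES)` or `PERF_COUNT_HW_CACHE_MISSES`, wrap the code of interest in `perf_start()` and `perf_stop_read()`, and then `close()` the descriptor. Because the kernel saves and restores the event with the thread, this path does not require pinning.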

    You can use "pread()" on the /dev/cpu/*/msr device files to read the MSRs -- this may be a bit easier to read than IOCTL-based code. The codes "rdmsr.c" and "wrmsr.c" from "msr-tools-1.3" provide excellent examples.
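
The `pread()` approach can be sketched as follows. This is a minimal sketch, assuming the `msr` kernel module is loaded and the process has permission to open the device (normally root); `read_msr` is a hypothetical helper name modeled loosely on what `rdmsr.c` does, not copied from it.

```c
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Read one 64-bit MSR on the given logical CPU.
   The MSR address is passed as the file offset.
   Returns 0 on success, -1 if the device cannot be opened or read. */
int read_msr(int cpu, uint32_t reg, uint64_t *value)
{
    char path[64];
    snprintf(path, sizeof(path), "/dev/cpu/%d/msr", cpu);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    ssize_t n = pread(fd, value, sizeof(*value), (off_t)reg);
    close(fd);
    return n == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

For example, `read_msr(0, 0xC1, &v)` would try to read IA32_PMC0 on CPU 0. The value is whatever the hardware register holds at that moment, so the caveats about other software using the same counter still apply.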

    There have been a number of approaches to reserving and sharing performance counters, including both software-only and combined hardware+software approaches, but at this point there is not a "standard" approach. (It looks like Intel has a hardware-based approach using MSR 0x392 IA32_PERF_GLOBAL_INUSE, but I don't know what platforms support it.)


    Your questions:

    What will happen if my process is scheduled out while my_program() is running and then scheduled onto another core?

    You'll see random garbage. The same happens if another process resets the PMCs between timeslices of your process.
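
One rough way to spot a migrated sample is to compare the CPU number before and after the measured region and discard the sample when they differ. This is only a sketch: `stayed_on_cpu` and `example_region` are hypothetical names, and the check cannot detect a thread that moved away and came back, or counters that another process reset in between.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Run region(arg) and report whether the thread was on the same CPU
   before and after it.
   Returns 1 if the CPU matched, 0 if it changed,
   and -1 if sched_getcpu() is unavailable. */
int stayed_on_cpu(void (*region)(void *), void *arg)
{
    int before = sched_getcpu();
    region(arg);
    int after = sched_getcpu();
    if (before < 0 || after < 0)
        return -1;
    return before == after;
}

/* Hypothetical workload standing in for my_program(): counts to one million. */
void example_region(void *arg)
{
    volatile unsigned long *n = arg;
    for (unsigned long i = 0; i < 1000000UL; i++)
        (*n)++;
}
```

A result of 0 means the counter values taken around the region came from two different cores and should be thrown away. Pinning the thread remains the more reliable fix.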

  • 2021-01-28 05:06

    I got the answers from an Intel forum; the link is below.

    https://software.intel.com/en-us/forums/intel-moderncode-for-parallel-architectures/topic/673602
