PMU for multi threaded environment

囚心锁ツ 2021-01-28 04:35

I am planning to measure PMU counters for L1, L2, and L3 cache misses and for branch mispredictions. I have read the related Intel documents, but I am unsure about the scenarios below. Could someone ...

2 Answers

情歌与酒 (OP)

2021-01-28 04:55
Summary of the Intel forum thread started by the OP:
- The Linux perf subsystem virtualizes the performance counters, so to get the full virtualized 64-bit value you have to read them with a system call rather than with rdpmc, which only returns whatever is currently in the architectural performance counter register.
- If you want to use rdpmc inside your own code so it can measure itself, pin each thread to a core because context switches don't save/restore PMCs. There's no easy way to avoid measuring everything that happens on the core, including interrupt handlers and other processes that get a timeslice. This can be a good thing, since you need to take the impact of kernel overhead into account.
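
The pinning step from the second point can be sketched as follows. This is a minimal illustration, assuming Linux and Python's `os.sched_setaffinity`; the helper names `pin_current_thread`, `pinned_worker`, and `run_pinned` are made up for this example and are not part of the original thread. Each thread has to pin itself, because passing pid 0 applies the mask to the calling thread only.

```python
import os
import threading


def pin_current_thread(cpu):
    """Restrict the calling thread to one logical CPU and return the old mask."""
    previous = os.sched_getaffinity(0)
    if cpu not in previous:
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(previous)}")
    os.sched_setaffinity(0, {cpu})  # pid 0 means "the calling thread"
    return previous


def pinned_worker(cpu, results):
    """Hypothetical worker: pin first, then run the instrumented code."""
    pin_current_thread(cpu)
    # ... the PMU-instrumented work would run here ...
    results[cpu] = os.sched_getaffinity(0)


def run_pinned(cpus):
    """Start one worker per CPU and report the affinity each one ended up with."""
    results = {}
    threads = [threading.Thread(target=pinned_worker, args=(c, results)) for c in cpus]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    allowed = sorted(os.sched_getaffinity(0))
    print(run_pinned(allowed[:2]))
```

Pinning only stops the thread from migrating. As noted above, anything else that runs on that core is still counted.
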
More useful quotes from John D. McCalpin, PhD ("Dr. Bandwidth"):

For inline code instrumentation you should be able to use the "perf events" API, but the documentation is minimal. Some resources are available at http://web.eece.maine.edu/~vweaver/projects/perf_events/faq.html
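
As a rough idea of what inline instrumentation through the perf events API looks like, here is a sketch that calls `perf_event_open` directly via `ctypes`. It rests on several assumptions that are not from the thread: Linux on x86-64 (syscall number 298), a little-endian machine, and the original 64-byte `perf_event_attr` layout. The helper names `pack_attr`, `perf_open`, and `count_event` are made up for this example. Counting around Python code also counts interpreter overhead; in C the same syscall, ioctls, and read would be used.

```python
import ctypes
import fcntl
import os
import struct

SYS_PERF_EVENT_OPEN = 298        # assumption: x86-64; other ABIs use other numbers
PERF_TYPE_HARDWARE = 0
PERF_COUNT_HW_CACHE_MISSES = 3   # usually last-level cache misses
PERF_COUNT_HW_BRANCH_MISSES = 5
PERF_ATTR_SIZE_VER0 = 64         # size of the first published perf_event_attr
PERF_EVENT_IOC_ENABLE = 0x2400
PERF_EVENT_IOC_DISABLE = 0x2401
PERF_EVENT_IOC_RESET = 0x2403
FLAG_DISABLED = 1 << 0           # bit positions assume a little-endian bitfield
FLAG_EXCLUDE_KERNEL = 1 << 5
FLAG_EXCLUDE_HV = 1 << 6
ATTR_FORMAT = "=IIQQQQQIIQ"      # 64 bytes in total


def pack_attr(event_type, config,
              flags=FLAG_DISABLED | FLAG_EXCLUDE_KERNEL | FLAG_EXCLUDE_HV):
    """Build a minimal perf_event_attr: type, size, config, flag bits, rest zero."""
    return struct.pack(ATTR_FORMAT, event_type, PERF_ATTR_SIZE_VER0, config,
                       0, 0, 0, flags, 0, 0, 0)


def perf_open(attr, pid=0, cpu=-1):
    """Open a counter for the calling thread (pid=0) on whichever CPU it runs (cpu=-1)."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    buf = (ctypes.c_char * len(attr)).from_buffer_copy(attr)
    fd = libc.syscall(ctypes.c_long(SYS_PERF_EVENT_OPEN), buf,
                      ctypes.c_long(pid), ctypes.c_long(cpu),
                      ctypes.c_long(-1), ctypes.c_ulong(0))
    if fd < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return fd


def count_event(fn, config=PERF_COUNT_HW_BRANCH_MISSES):
    """Run fn() with the counter enabled and return the virtualized 64-bit count."""
    fd = perf_open(pack_attr(PERF_TYPE_HARDWARE, config))
    try:
        fcntl.ioctl(fd, PERF_EVENT_IOC_RESET, 0)
        fcntl.ioctl(fd, PERF_EVENT_IOC_ENABLE, 0)
        fn()
        fcntl.ioctl(fd, PERF_EVENT_IOC_DISABLE, 0)
        (value,) = struct.unpack("=Q", os.read(fd, 8))
        return value
    finally:
        os.close(fd)


if __name__ == "__main__":
    try:
        print(count_event(lambda: sum(i % 3 for i in range(100000))))
    except OSError as exc:
        # Typical causes: perf_event_paranoid settings, containers, or no PMU.
        print("perf_event_open unavailable:", exc)
```

Because the kernel saves and restores the count across context switches, a counter opened this way follows the thread instead of the core.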

You can use "pread()" on the /dev/cpu/*/msr device files to read the MSRs -- this may be a bit easier to read than IOCTL-based code. The codes "rdmsr.c" and "wrmsr.c" from "msr-tools-1.3" provide excellent examples.
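
The pread() approach comes down to reading 8 bytes at an offset equal to the MSR number. Below is a minimal sketch, assuming Linux with the `msr` kernel module loaded and root privileges. The helper names `msr_path`, `read_u64_at`, and `read_msr` are made up for this example, and MSR 0x10 (IA32_TIME_STAMP_COUNTER) is used only as a register that should exist on any x86 system.

```python
import os
import struct


def msr_path(cpu):
    """Device file that exposes the MSRs of one logical CPU."""
    return f"/dev/cpu/{cpu}/msr"


def read_u64_at(path, offset):
    """Read one little-endian 64-bit value at the given file offset."""
    fd = os.open(path, os.O_RDONLY)
    try:
        data = os.pread(fd, 8, offset)
    finally:
        os.close(fd)
    if len(data) != 8:
        raise OSError(f"short read from {path} at offset {offset:#x}")
    return struct.unpack("<Q", data)[0]


def read_msr(cpu, reg):
    """Read MSR number `reg` on logical CPU `cpu` (the offset is the MSR number)."""
    return read_u64_at(msr_path(cpu), reg)


if __name__ == "__main__":
    IA32_TIME_STAMP_COUNTER = 0x10
    try:
        print(hex(read_msr(0, IA32_TIME_STAMP_COUNTER)))
    except OSError as exc:
        # Expected without root or without the msr module.
        print("cannot read MSR:", exc)
```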

There have been a number of approaches to reserving and sharing performance counters, including both software-only and combined hardware+software approaches, but at this point there is not a "standard" approach. (It looks like Intel has a hardware-based approach using MSR 0x392 IA32_PERF_GLOBAL_INUSE, but I don't know what platforms support it.)

Your questions:

What will happen if my process is scheduled out while my_program() is running, and then scheduled onto another core?

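You'll see random garbage, because each core has its own counters, so the readings taken before and after the migration come from different PMCs. The same happens if another process resets the PMCs between timeslices of your process.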