Question
With perf (the Linux profiler, v4.15.18), I can run perf stat $COMMAND to get some simple stats on the command. If I run perf record, it saves lots of data to a perf.data file.
Can I run perf stat on the output of perf record, so that I can look at the recorded data but also get a simple overview?
Answer 1:
perf stat uses the hardware performance monitoring unit in counting mode, while perf record/perf report with a perf.data file use the same unit in overflow (sampling) mode. In both modes the hardware performance counters are programmed through a control register to count some kind of performance event (for example CPU cycles or instructions executed), and the counter is incremented on every such event.
In counting mode, perf stat sets the counters to zero at program start and reads the final counter values at program exit (the counting may actually be split into several segments, with the same result: a single value for the full run).
In profiling mode (sampling profiling), perf record sets the counter to some negative value, for example -100000, and installs an overflow handler (the actual value is autotuned to reach a target sampling frequency). After every 100000 events the counter overflows to zero and generates an interrupt. The perf_events interrupt handler records a "sample" (current time, pid, instruction pointer, and optionally the call stack with -g) into a ring buffer, which is then saved to perf.data. The handler also resets the counter to -100000. So, after a long enough run there will be thousands of samples stored in perf.data, which can be used to generate a statistical profile of the program (which parts of the program ran most often).
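The difference between the two modes can be illustrated with a toy model. This is only a sketch in plain Python, not how the kernel or the hardware is implemented; the names count_events, sample_events, TOTAL and PERIOD are made up for the illustration, and the period is fixed instead of autotuned.

```python
def count_events(total_events):
    """Counting mode: start the counter at zero and read it once at the end."""
    counter = 0
    for _ in range(total_events):
        counter += 1
    return counter


def sample_events(total_events, period):
    """Sampling mode: preset the counter to -period, record a sample on overflow."""
    counter = -period
    samples = []
    for event_index in range(total_events):
        counter += 1
        if counter == 0:                 # overflow to zero: the "interrupt" fires
            samples.append(event_index)  # stand-in for (time, pid, instruction pointer)
            counter = -period            # the handler re-arms the counter
    return samples


TOTAL = 1_234_567   # hypothetical number of events during the run
PERIOD = 100_000    # fixed sampling period (real perf autotunes this)

exact = count_events(TOTAL)
samples = sample_events(TOTAL, PERIOD)
estimate = len(samples) * PERIOD

print("counting mode :", exact)
print("sampling mode :", len(samples), "samples, estimated count", estimate)
```

In this model the sampled estimate is always a multiple of the period, so it can fall short of the exact count by up to one period. That is one reason a count reconstructed from samples is only approximate.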
What does perf stat show? In default mode on an x86_64 CPU: the running time of the program (task-clock and elapsed), 3 software events (context switches, CPU migrations, page faults), and 4 hardware counters (cycles, instructions, branches, branch-misses):
$ echo '3^123456%3' | perf stat bc
0
 Performance counter stats for 'bc':

        325.604672      task-clock (msec)         #    0.998 CPUs utilized
                 0      context-switches          #    0.000 K/sec
                 0      cpu-migrations            #    0.000 K/sec
               181      page-faults               #    0.556 K/sec
       828,234,675      cycles                    #    2.544 GHz
     1,840,146,399      instructions              #    2.22  insn per cycle
       348,965,282      branches                  # 1071.745 M/sec
        15,385,371      branch-misses             #    4.41% of all branches

       0.326152702 seconds time elapsed
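The values after the # signs are derived from the raw counters. The following sketch recomputes a few of them from the numbers in the output above; the formulas are the usual definitions (task-clock over wall time, cycles over task-clock, and so on) and are assumed here rather than taken from the perf source.

```python
# Raw values copied from the perf stat output above.
task_clock_ms = 325.604672
elapsed_s = 0.326152702
cycles = 828_234_675
instructions = 1_840_146_399
branches = 348_965_282
branch_misses = 15_385_371

task_clock_s = task_clock_ms / 1000.0

cpus_utilized = task_clock_s / elapsed_s              # CPU time / wall-clock time
ghz = cycles / task_clock_s / 1e9                     # cycles per second of CPU time
insn_per_cycle = instructions / cycles
branch_miss_pct = 100.0 * branch_misses / branches

print("%.3f CPUs utilized" % cpus_utilized)
print("%.3f GHz" % ghz)
print("%.2f insn per cycle" % insn_per_cycle)
print("%.2f%% of all branches" % branch_miss_pct)
```

Any of these ratios can only be rebuilt from a perf.data file if both of its inputs were recorded.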
What does perf record record? With a single wake-up event (ring buffer overflow) it saved 1293 samples into perf.data, and the default hardware event (cycles) was used:
$ echo '3^123456%3' | perf record bc
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.049 MB perf.data (1293 samples) ]
With perf report --header | less, perf script and perf script -D you can take a look at the contents of perf.data:
$ perf report --header |grep event
# event : name = cycles:uppp, , size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|PERIOD ...
# Samples: 1K of event 'cycles:uppp'
$ perf script 2>/dev/null |grep cycles|wc -l
1293
There are some timestamps inside perf.data and some additional events for program start and exit (perf script -D |egrep exec\|EXIT), but there is not enough information in a default perf.data to fully reconstruct the perf stat output. Running time is recorded only as the timestamps of start and exit and of every sample; software events are not recorded; and only a single hardware event was used (cycles; no instructions, branches, or branch-misses). The count for the recorded hardware event can be approximated, but the result is not exact (the real cycle count was around 820-825 million):
$ perf report --header |grep Event
# Event count (approx.): 836622729
With a non-default recording, more events can be estimated from perf.data:
$ echo '3^123456%3' | perf record -e cycles,instructions,branches,branch-misses bc
[ perf record: Captured and wrote 0.238 MB perf.data (5164 samples) ]
$ perf report --header |egrep Event\|Samples
# Samples: 1K of event 'cycles'
# Event count (approx.): 834809036
# Samples: 1K of event 'instructions'
# Event count (approx.): 1834083643
# Samples: 1K of event 'branches'
# Event count (approx.): 347750459
# Samples: 1K of event 'branch-misses'
# Event count (approx.): 15382047
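To turn these header lines into something closer to a perf stat summary, they can be parsed with a few lines of Python. The sketch below works on the text shown above, pasted in as a string; in practice it could be fed the output of perf report --header instead. The function name parse_event_counts and the regular expressions are assumptions based only on the line format visible in this output.

```python
import re

# Text copied from the perf report --header output shown above.
REPORT_TEXT = """\
# Samples: 1K of event 'cycles'
# Event count (approx.): 834809036
# Samples: 1K of event 'instructions'
# Event count (approx.): 1834083643
# Samples: 1K of event 'branches'
# Event count (approx.): 347750459
# Samples: 1K of event 'branch-misses'
# Event count (approx.): 15382047
"""

SAMPLES_RE = re.compile(r"^# Samples:\s+\S+\s+of event '([^']+)'")
COUNT_RE = re.compile(r"^# Event count \(approx\.\):\s+(\d+)")


def parse_event_counts(text):
    """Pair each "Samples ... of event 'X'" line with the count line after it."""
    counts = {}
    current_event = None
    for line in text.splitlines():
        match = SAMPLES_RE.match(line)
        if match:
            current_event = match.group(1)
            continue
        match = COUNT_RE.match(line)
        if match and current_event is not None:
            counts[current_event] = int(match.group(1))
            current_event = None
    return counts


counts = parse_event_counts(REPORT_TEXT)
for name, value in counts.items():
    print("%15d  %s" % (value, name))

# Derived ratios are estimates too, since both inputs are estimates.
estimated_ipc = counts["instructions"] / counts["cycles"]
estimated_miss_pct = 100.0 * counts["branch-misses"] / counts["branches"]
print("%.2f insn per cycle (estimated)" % estimated_ipc)
print("%.2f%% of all branches (estimated)" % estimated_miss_pct)
```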
So, you can't run perf stat on a perf.data file, but you can ask perf report to print the header with the event count estimates. You can also try to parse the timestamps from perf script / perf script -D.
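As a rough sketch of the timestamp idea, the following Python snippet extracts the timestamps from perf script style lines and reports the span between the first and the last sample. The three sample lines are invented for illustration (they are not from a real run), and the layout (command, pid, then a seconds.microseconds timestamp followed by a colon) is an assumption about the default output that may differ between perf versions and field selections.

```python
import re
from decimal import Decimal

# Hypothetical perf script lines, made up for this example.
SCRIPT_OUTPUT = """\
bc  4242  1000.000100:          1 cycles:uppp:  7f0000001000 [unknown] ([unknown])
bc  4242  1000.150350:     250000 cycles:uppp:  55d000002000 [unknown] ([unknown])
bc  4242  1000.325600:     250000 cycles:uppp:  55d000003000 [unknown] ([unknown])
"""

# First "digits.digits:" token on the line is taken to be the timestamp.
TIMESTAMP_RE = re.compile(r"\s(\d+\.\d+):\s")


def sample_timestamps(text):
    """Return the timestamp of every line that has one, as exact decimals."""
    stamps = []
    for line in text.splitlines():
        match = TIMESTAMP_RE.search(line)
        if match:
            stamps.append(Decimal(match.group(1)))
    return stamps


stamps = sample_timestamps(SCRIPT_OUTPUT)
span = max(stamps) - min(stamps)
print("first-to-last sample span: %s seconds over %d samples" % (span, len(stamps)))
```

This span covers only the time between the first and the last sample, so it underestimates the elapsed time that perf stat would report.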
Answer 2:
No, you can't. The output of perf record is a data file, while perf stat expects an application to run.
You can use perf script to run pre-canned scripts that aggregate and summarize the trace data. The available scripts can be listed with the following command:
perf script -l
Besides the limited number of pre-canned scripts, you can also define custom perf.data processing scripts in Python or Perl.
See the perf-script, perf-script-python and perf-script-perl documentation for details.
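A custom Python script for this purpose might look like the minimal sketch below, which sums the sample periods per event to get a perf stat like overview. It assumes the generic process_event(param_dict) handler described in the perf-script-python documentation, and that the dictionary carries an "ev_name" entry and a "sample" dictionary with a "period" key; check these against your perf version. The file name summary.py is made up.

```python
# summary.py (hypothetical name): sum sample periods per event.
from collections import defaultdict

period_by_event = defaultdict(int)
samples_by_event = defaultdict(int)


def process_event(param_dict):
    """Called by perf script for each sample (assumed generic handler)."""
    name = param_dict["ev_name"]
    sample = param_dict["sample"]
    # A missing period is counted as 0 rather than guessed.
    period_by_event[name] += sample.get("period", 0)
    samples_by_event[name] += 1


def trace_end():
    """Called by perf script once all samples have been processed."""
    for name in sorted(period_by_event):
        print("%15d  %s  (%d samples)"
              % (period_by_event[name], name, samples_by_event[name]))
```

It would be run as perf script -s summary.py in the directory containing perf.data. The totals are the same kind of approximation as the event counts printed by perf report, not exact counter values.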
Source: https://stackoverflow.com/questions/62550369/run-perf-stat-on-the-output-of-perf-record