Why does perf stat show "stalled-cycles-backend" as <not supported>?

渐次进展 2021-01-31 04:11

Running perf stat ls shows this:

Performance counter stats for 'ls':

          1.388670 task-clock                #    0.067 CPUs utilized                


        
3 answers
  •  一生所求
    2021-01-31 04:53

    Your perf (or its in-kernel part) has not been updated to support your CPU, so perf cannot map the generic event name "stalled-cycles-backend" to an actual hardware event.
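One quick way to see which named events the kernel knows for your CPU is to look at what it exports through sysfs. The sketch below is only a hint, not a definitive check: it assumes the core PMU is exposed under /sys/bus/event_source/devices/cpu/events (the usual location on x86 Linux), and the helper names exported_events and is_exported are made up for this example.

```python
from pathlib import Path

# Assumption: the core PMU's named events live here (typical on x86 Linux).
SYSFS_EVENTS = Path("/sys/bus/event_source/devices/cpu/events")


def exported_events(events_dir=SYSFS_EVENTS):
    """Return {event name: encoding string} for each event file in events_dir."""
    events_dir = Path(events_dir)
    if not events_dir.is_dir():
        # No such directory (non-Linux system, different PMU name, ...).
        return {}
    events = {}
    for entry in sorted(events_dir.iterdir()):
        # Skip companion files such as "<name>.scale" or "<name>.unit".
        if entry.is_file() and "." not in entry.name:
            events[entry.name] = entry.read_text().strip()
    return events


def is_exported(name, events_dir=SYSFS_EVENTS):
    """True if the kernel exports an event file with this exact name."""
    return name in exported_events(events_dir)


if __name__ == "__main__":
    for name in ("stalled-cycles-frontend", "stalled-cycles-backend"):
        state = "exported" if is_exported(name) else "not exported"
        print(f"{name}: {state}")
```

If "stalled-cycles-backend" is missing from that directory, it is consistent with the kernel having no mapping for it on this CPU, which is the situation described above.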

    In that case it can be easier to look up the CPU-specific event names directly; for Intel CPUs, for example, they are listed in Intel's optimization manual http://www.intel.com/content/dam/doc/manual/64-ia-32-architectures-optimization-manual.pdf (which groups events by type and explains how to use them to measure various parts of the processor). I don't have a similar document for AMD.

    To use event names with perf without manually converting them into raw event IDs (as amdn describes in his answer), you can use the converter tools showevtinfo and check_events from perfmon2 (libpfm4, in the examples folder), as explained in the article "How to monitor the full range of CPU performance events" by Bojan Nikolic: http://www.bnikolic.co.uk/blog/hpc-prof-events.html. perfmon2 supports both AMD and Intel CPUs and is written in C/C++.
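For reference, the manual conversion into a raw event ID is mostly bit packing. The sketch below assumes the Intel-style field layout that perf uses for the core PMU (event select in bits 0-7, unit mask in bits 8-15, edge in bit 18, invert in bit 23, counter mask in bits 24-31). The function name raw_event is made up for this example, and the event/umask values in the usage lines are placeholders; take real ones from your CPU vendor's manual. AMD's extended event-select bits are not handled.

```python
def raw_event(event, umask, *, edge=False, inv=False, cmask=0):
    """Build a perf raw event string ("r<hex>") from Intel-style fields.

    Assumed layout: event[0:7], umask[8:15], edge[18], inv[23], cmask[24:31].
    """
    for label, value in (("event", event), ("umask", umask), ("cmask", cmask)):
        if not 0 <= value <= 0xFF:
            raise ValueError(f"{label} must fit in 8 bits, got {value:#x}")
    config = (
        event
        | (umask << 8)
        | (int(edge) << 18)
        | (int(inv) << 23)
        | (cmask << 24)
    )
    return f"r{config:x}"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(raw_event(0xA2, 0x01))                      # r1a2
    print(raw_event(0xB1, 0x01, inv=True, cmask=1))   # r18001b1
```

The resulting string can be passed to perf as an event, for example perf stat -e r1a2 ls.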

    For Intel CPUs the easiest way is to use ocperf, a wrapper around perf from "pmu-tools", Intel's open-source Python project by Andi Kleen, hosted on GitHub at https://github.com/andikleen/pmu-tools. It was introduced on the mailing list (https://lwn.net/Articles/556983/) and on Andi's blog (http://halobates.de/blog/p/245).

    ocperf understands all Intel event names from Intel's optimization manual.

    ocperf also supports every hardware event on older Linux kernels. It has its own database, in TSV or JSON format, of all hardware events and their codes at https://download.01.org/perfmon/ (pmu-tools includes an auto-downloader), and the database is constantly updated by Intel employees. The database format is documented in the readme: https://download.01.org/perfmon/readme.txt
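To give a feel for what such a database lookup involves, here is a minimal sketch that resolves an event name to a raw code from a JSON event list. It assumes each entry is an object with "EventName", "EventCode" and "UMask" keys holding hex strings; check the readme for the actual format. The two sample entries and the helper find_event are invented for illustration and do not describe real events.

```python
import json

# Made-up sample in the assumed shape; not real event data.
SAMPLE = """
[
  {"EventName": "EXAMPLE.STALL_ANY", "EventCode": "0xA2", "UMask": "0x01"},
  {"EventName": "EXAMPLE.OTHER", "EventCode": "0x3C", "UMask": "0x00"}
]
"""


def find_event(events, name):
    """Return (event_code, umask) for name, or None if it is not listed.

    Matching is case-insensitive. Entries whose EventCode holds several
    comma-separated codes are not handled by this sketch.
    """
    wanted = name.upper()
    for entry in events:
        if entry.get("EventName", "").upper() == wanted:
            return int(entry["EventCode"], 16), int(entry["UMask"], 16)
    return None


events = json.loads(SAMPLE)

if __name__ == "__main__":
    code, umask = find_event(events, "example.stall_any")
    # Pack umask and event code into perf's raw form.
    print(f"r{(umask << 8) | code:x}")
```

ocperf does this kind of translation for you, which is why it keeps working even when the kernel's own event tables lag behind the CPU.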

    For Sandy Bridge, Ivy Bridge, or Haswell, with kernel 3.10 or newer, you can also use the toplev.py script from "pmu-tools" to investigate performance. Its author, Andi Kleen, describes it in "pmu-tools, part II: toplev" (http://halobates.de/blog/p/262). It is based on the "TopDown" method from Ahmad Yasin: "How to Tune Applications Using a Top-Down Characterization of Microarchitectural Issues" and "Top Down Analysis. Never lost with performance counters".
