Hardware cache events and perf

Submitted by ♀尐吖头ヾ on 2019-11-26 23:24:28

Question


When I run perf list I see a bunch of Hardware Cache Events, as follows:

$ perf list | grep 'cache event'
  L1-dcache-load-misses                              [Hardware cache event]
  L1-dcache-loads                                    [Hardware cache event]
  L1-dcache-stores                                   [Hardware cache event]
  L1-icache-load-misses                              [Hardware cache event]
  LLC-load-misses                                    [Hardware cache event]
  LLC-loads                                          [Hardware cache event]
  LLC-store-misses                                   [Hardware cache event]
  LLC-stores                                         [Hardware cache event]
  branch-load-misses                                 [Hardware cache event]
  branch-loads                                       [Hardware cache event]
  dTLB-load-misses                                   [Hardware cache event]
  dTLB-loads                                         [Hardware cache event]
  dTLB-store-misses                                  [Hardware cache event]
  dTLB-stores                                        [Hardware cache event]
  iTLB-load-misses                                   [Hardware cache event]
  iTLB-loads                                         [Hardware cache event]
  node-load-misses                                   [Hardware cache event]
  node-loads                                         [Hardware cache event]
  node-store-misses                                  [Hardware cache event]
  node-stores                                        [Hardware cache event]

These events mostly seem to return reasonable values in my tests, but how can I determine which hardware performance counter events they map to on my system?

That is, these events are certainly implemented using one or more underlying x86 PMU counters on my Skylake CPU - but how do I know which ones?

You can look in /sys/devices/cpu/events for other hardware events, but not for "Hardware cache events".
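
For reference, the aliases that sysfs does expose can be dumped with a short script. This is only a sketch: the helper name read_sysfs_events is made up for illustration, and it assumes the directory holds one small text file per event alias whose content is the encoding string (for example event=0x3c).

```python
from pathlib import Path


def read_sysfs_events(events_dir="/sys/devices/cpu/events"):
    """Return {alias: encoding string} for each readable file in events_dir.

    read_sysfs_events is a hypothetical helper, not part of perf.
    An empty dict is returned when the directory does not exist.
    """
    root = Path(events_dir)
    if not root.is_dir():
        return {}
    events = {}
    for entry in sorted(root.iterdir()):
        if not entry.is_file():
            continue
        try:
            events[entry.name] = entry.read_text().strip()
        except OSError:
            # Skip entries the current user cannot read.
            continue
    return events


if __name__ == "__main__":
    for name, encoding in read_sysfs_events().items():
        print(f"{name:40s} {encoding}")
```

None of the "Hardware cache event" names from perf list are expected to show up in this output, which is the gap the question is about.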


Answer 1:


User @Margaret points towards a reasonable answer in the comments - read the kernel source to see the mapping for the PMU events.

We can check arch/x86/events/intel/core.c for the event definitions. I don't actually know whether "core" here refers to the Core architecture or just means that this is the core file holding most of the definitions, but in any case it's the file you want to look at.

The key part is this section, which defines skl_hw_cache_event_ids:

static __initconst const u64 skl_hw_cache_event_ids
                [PERF_COUNT_HW_CACHE_MAX]
                [PERF_COUNT_HW_CACHE_OP_MAX]
                [PERF_COUNT_HW_CACHE_RESULT_MAX] =
{
 [ C(L1D ) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = 0x81d0,  /* MEM_INST_RETIRED.ALL_LOADS */
        [ C(RESULT_MISS)   ] = 0x151,   /* L1D.REPLACEMENT */
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = 0x82d0,  /* MEM_INST_RETIRED.ALL_STORES */
        [ C(RESULT_MISS)   ] = 0x0,
    },
    [ C(OP_PREFETCH) ] = {
        [ C(RESULT_ACCESS) ] = 0x0,
        [ C(RESULT_MISS)   ] = 0x0,
    },
},
...

Decoding the nested initializers, you get that L1-dcache-loads corresponds to MEM_INST_RETIRED.ALL_LOADS and L1-dcache-load-misses to L1D.REPLACEMENT.
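
The hex values in the table can be split into the usual x86 event-select and unit-mask fields. The sketch below assumes the standard layout for the low 16 bits of a raw x86 perf config (bits 0-7 event select, bits 8-15 umask) and ignores higher fields such as cmask; the function names are made up for illustration.

```python
def split_raw_code(code):
    """Split a raw x86 PMU config into (event_select, umask).

    Assumes bits 0-7 hold the event select and bits 8-15 the unit mask;
    any higher bits (cmask, inv, edge, ...) are ignored here.
    """
    event = code & 0xFF
    umask = (code >> 8) & 0xFF
    return event, umask


def to_perf_spec(code):
    """Format a raw code as a perf-style cpu/event=...,umask=.../ string."""
    event, umask = split_raw_code(code)
    return f"cpu/event=0x{event:02x},umask=0x{umask:02x}/"


# Values copied from the skl_hw_cache_event_ids excerpt above.
L1D_CODES = {
    "L1-dcache-loads": 0x81D0,        # MEM_INST_RETIRED.ALL_LOADS
    "L1-dcache-load-misses": 0x151,   # L1D.REPLACEMENT
    "L1-dcache-stores": 0x82D0,       # MEM_INST_RETIRED.ALL_STORES
}

if __name__ == "__main__":
    for name, code in L1D_CODES.items():
        print(f"{name:24s} {code:#06x} -> {to_perf_spec(code)}")
```

So 0x81d0 decodes to event 0xd0 with umask 0x81, which should be usable with perf stat either as cpu/event=0xd0,umask=0x81/ or as the raw event r81d0.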

We can double-check this with perf (here via the ocperf wrapper from pmu-tools, which accepts the Intel event names):

$ ocperf stat -e mem_inst_retired.all_loads,L1-dcache-loads,l1d.replacement,L1-dcache-load-misses,L1-dcache-loads,mem_load_retired.l1_hit head -c100M /dev/zero > /dev/null

 Performance counter stats for 'head -c100M /dev/zero':

        11,587,793      mem_inst_retired_all_loads                                   
        11,587,793      L1-dcache-loads                                             
            20,233      l1d_replacement                                             
            20,233      L1-dcache-load-misses     #    0.17% of all L1-dcache hits  
        11,587,793      L1-dcache-loads                                             
        11,495,053      mem_load_retired_l1_hit                                     

       0.024322360 seconds time elapsed

The "Hardware Cache" events show exactly the same values as the underlying PMU events we identified by checking the source.
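
The comparison can also be done mechanically. The following sketch parses the counter lines copied from the run above and checks that each generic event matches its suspected raw event; the regular expression is an assumption about the human-readable perf stat layout, not a documented format.

```python
import re

# Counter lines copied from the perf stat output above.
SAMPLE = """\
        11,587,793      mem_inst_retired_all_loads
        11,587,793      L1-dcache-loads
            20,233      l1d_replacement
            20,233      L1-dcache-load-misses     #    0.17% of all L1-dcache hits
        11,587,793      L1-dcache-loads
        11,495,053      mem_load_retired_l1_hit
"""

# A count with thousands separators, then whitespace, then the event name.
LINE_RE = re.compile(r"^\s*([\d,]+)\s+(\S+)")


def parse_counts(text):
    """Return {event_name: count} for every line that looks like a counter."""
    counts = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            counts[match.group(2)] = int(match.group(1).replace(",", ""))
    return counts


if __name__ == "__main__":
    counts = parse_counts(SAMPLE)
    pairs = [
        ("L1-dcache-loads", "mem_inst_retired_all_loads"),
        ("L1-dcache-load-misses", "l1d_replacement"),
    ]
    for generic, raw in pairs:
        verdict = "match" if counts[generic] == counts[raw] else "differ"
        print(f"{generic} vs {raw}: {verdict}")
    ratio = 100.0 * counts["L1-dcache-load-misses"] / counts["L1-dcache-loads"]
    print(f"miss ratio: {ratio:.2f}%")
```

The computed ratio agrees with the 0.17% annotation that perf prints next to L1-dcache-load-misses.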



Source: https://stackoverflow.com/questions/52170960/hardware-cache-events-and-perf
