Why does Linux perf use event l1d.replacement for “L1 dcache misses” on x86?

Submitted by 余生颓废 on 2019-12-23 09:21:47

Question


On Intel x86, Linux uses the event l1d.replacement to implement its L1-dcache-load-misses event.

This event is defined as follows:

Counts L1D data line replacements including opportunistic replacements, and replacements that require stall-for-replace or block-for-replace.

Perhaps naively, I would have expected perf to use something like mem_load_retired.l1_miss, which supports PEBS and is defined as:

Counts retired load instructions with at least one uop that missed in the L1 cache. (Supports PEBS)

The two event counts are usually not very close, and sometimes they differ wildly. For example:

$ ocperf stat -e mem_inst_retired.all_loads,l1d.replacement,mem_load_retired.l1_hit,mem_load_retired.l1_miss,mem_load_retired.fb_hit head -c100M /dev/urandom > /dev/null

 Performance counter stats for 'head -c100M /dev/urandom':

       445,662,315      mem_inst_retired_all_loads                                   
            92,968      l1d_replacement                                             
       443,864,439      mem_load_retired_l1_hit                                     
         1,694,671      mem_load_retired_l1_miss                                    
            28,080      mem_load_retired_fb_hit                                     

mem_load_retired.l1_miss reports about 18 times as many "L1 misses" as l1d.replacement (1,694,671 versus 92,968). Conversely, you can also find examples where l1d.replacement is much higher than the mem_load_retired counters.
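The gap between the two candidate "miss" events can be checked directly from the counter values in the output above. The following is only a minimal sketch of that arithmetic: the dictionary and variable names are made up for illustration, and the numbers are copied from this single run, so they say nothing about other workloads.

```python
# Counter values copied from the ocperf output above (one run of
# `head -c100M /dev/urandom`); the names below are illustrative only.
counts = {
    "mem_inst_retired.all_loads": 445_662_315,
    "l1d.replacement": 92_968,
    "mem_load_retired.l1_hit": 443_864_439,
    "mem_load_retired.l1_miss": 1_694_671,
    "mem_load_retired.fb_hit": 28_080,
}

all_loads = counts["mem_inst_retired.all_loads"]

# How many times larger is the retired-load miss count than the
# line-replacement count?
miss_to_replacement = (
    counts["mem_load_retired.l1_miss"] / counts["l1d.replacement"]
)

# "Miss rate" per retired load, depending on which event is treated
# as the miss count.
rate_from_l1_miss = counts["mem_load_retired.l1_miss"] / all_loads
rate_from_replacement = counts["l1d.replacement"] / all_loads

print(f"l1_miss / l1d.replacement: {miss_to_replacement:.1f}x")
print(f"miss rate via mem_load_retired.l1_miss: {rate_from_l1_miss:.3%}")
print(f"miss rate via l1d.replacement: {rate_from_replacement:.3%}")
```

For this run the ratio works out to roughly 18.2, so the reported L1 miss rate changes by more than an order of magnitude depending on which event is used.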

What exactly is l1d.replacement measuring, why was it chosen in the kernel, and is it a better proxy for L1 d-cache misses than mem_load_retired.l1_miss?

Source: https://stackoverflow.com/questions/52173478/why-does-linux-perf-use-event-l1d-replacement-for-l1-dcache-misses-on-x86
