How to catch L3-cache hits and misses with the perf tool in Linux
Solution 1
It is strange that LLC (Last Level Cache) is aliased to "L2" when the hardware has an L3 cache. I don't know the internals of perf yet, though, and these settings may simply be generic.
I think the only solution is to use a "raw hardware event" (see the end of the "perf list" output, the line starting with "rNNN"). This lets you encode a description of the hardware registers directly.
The perf user guide and tutorial only say: "To measure an actual PMU as provided by the HW vendor documentation, pass the hexadecimal parameter code". I don't know the syntax on Intel, or whether there are different implementations of the performance monitor on this architecture. You could start here:
http://code.google.com/p/kernel/wiki/PerfUserGuide#Hardware_events
Solution 2
I have had more success using raw event counters, taking the detailed definitions directly from the Intel Software Developer Manual.
From section 18.2.1.2, "Pre-defined Architectural Performance Events":
r412e ("LLC Misses") is likely the one you want:
perf stat -e r412e <command>
(Note that for me, this gives the same number as using -e cache-misses.)
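The raw code used above can be derived from the two fields the Intel manual lists for each architectural event. Here is a minimal Python sketch, assuming the usual x86 layout for perf raw events (event select in bits 0-7, unit mask in bits 8-15) and ignoring the higher modifier bits. The helper name raw_event is made up for illustration and is not part of perf.

```python
def raw_event(event_select: int, umask: int) -> str:
    """Build a perf 'rNNN' raw event string from an event select and unit mask.

    Assumes the x86 layout: umask in bits 8-15, event select in bits 0-7.
    """
    if not (0 <= event_select <= 0xFF and 0 <= umask <= 0xFF):
        raise ValueError("event_select and umask must each fit in one byte")
    return f"r{(umask << 8) | event_select:x}"


# Architectural events from the Intel manual's pre-defined event table:
# "LLC Misses" is event select 0x2E with unit mask 0x41,
# "LLC Reference" is event select 0x2E with unit mask 0x4F.
llc_misses = raw_event(0x2E, 0x41)
llc_references = raw_event(0x2E, 0x4F)
print(llc_misses, llc_references)
```

Passing both strings to perf stat -e, separated by a comma, should count references and misses in the same run, which gives the two numbers needed for a miss rate.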
Solution 3
To get system-wide L3 cache miss rate, just do:
$ sudo perf stat -a -e LLC-loads,LLC-load-misses,LLC-stores,LLC-store-misses,LLC-prefetch-misses sleep 5
 Performance counter stats for 'system wide':

    24,477,266,369      LLC-loads                                          (22.65%)
     1,409,470,007      LLC-load-misses      #  5.76% of all LL-cache hits (29.79%)
        88,584,705      LLC-stores                                         (30.32%)
        10,545,277      LLC-store-misses                                   (30.03%)
       150,785,745      LLC-prefetch-misses                                (34.71%)

      13.773144159 seconds time elapsed
This prints both the misses and the total references. The ratio of misses to references is the L3 cache miss rate.
See the complete event list on the wiki: https://perf.wiki.kernel.org/index.php/Tutorial#Events
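As a rough illustration of that ratio, the Python sketch below parses the sample output shown above and divides LLC-load-misses by LLC-loads. The function name parse_perf_stat and the regular expression are assumptions written for this particular human-readable layout, not a perf API. The percentages in parentheses show how long each counter was actually running, so the counts are scaled estimates.

```python
import re

SAMPLE = """\
Performance counter stats for 'system wide':
24,477,266,369 LLC-loads (22.65%)
1,409,470,007 LLC-load-misses # 5.76% of all LL-cache hits (29.79%)
88,584,705 LLC-stores (30.32%)
10,545,277 LLC-store-misses (30.03%)
150,785,745 LLC-prefetch-misses (34.71%)
13.773144159 seconds time elapsed
"""


def parse_perf_stat(text: str) -> dict:
    """Map event name -> count for lines of the form '<count> <event> ...'."""
    counts = {}
    for line in text.splitlines():
        # A count is digits with thousands separators, followed by whitespace
        # and the event name; header and elapsed-time lines do not match.
        match = re.match(r"\s*([\d,]+)\s+(\S+)", line)
        if match:
            counts[match.group(2)] = int(match.group(1).replace(",", ""))
    return counts


counts = parse_perf_stat(SAMPLE)
load_miss_rate = counts["LLC-load-misses"] / counts["LLC-loads"]
print(f"LLC load miss rate: {load_miss_rate:.2%}")
```

For the sample numbers this agrees with the 5.76% that perf itself prints next to LLC-load-misses.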
Admin
Updated on June 05, 2022
Comments
-
Admin almost 2 years
Is there any way to catch L3-cache hits and misses with the perf tool in Linux? According to the output of
perf list cache
L1 and LLC cache events are supported. According to the definition of the perf_evsel__hw_cache array in perf's source code:
const char *perf_evsel__hw_cache[PERF_COUNT_HW_CACHE_MAX]
                                [PERF_EVSEL__MAX_ALIASES] = {
    { "L1-dcache", "l1-d", "l1d", "L1-data", },
    { "L1-icache", "l1-i", "l1i", "L1-instruction", },
    { "LLC", "L2", },
    { "dTLB", "d-tlb", "Data-TLB", },
    { "iTLB", "i-tlb", "Instruction-TLB", },
    { "branch", "branches", "bpu", "btb", "bpc", },
    { "node", },
};
LLC is an alias for the L2 cache. My question is how to catch L3-cache hits and misses with the perf tool in Linux. Thanks in advance!
-
osgx almost 10 years
The page bnikolic.co.uk/blog/hpc-prof-events.html also has advice on finding and using raw perf events with the help of the libpfm4 (perfmon2) utilities showevtinfo and check_events.
-
Zheng Shao over 6 years
To get system-wide L3 cache miss rate, just do:
sudo perf stat -a -e LLC-loads -e LLC-load-misses -e LLC-stores -e LLC-store-misses -e LLC-prefetch-misses
which prints out both misses and total references. The ratio is the L3 cache miss rate.
-
blaze9 over 3 years
May I ask what LLC-prefetch-misses are and how they should be used in calculating the L3 cache miss rate? So far, my calculation is simply (LLC-load-misses + LLC-store-misses) / (LLC-loads + LLC-stores).
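The calculation described in this comment can be written as a small Python sketch. The function name combined_miss_rate is hypothetical, and the counts are the ones from the sample perf stat output in Solution 3. The sketch leaves prefetch misses out, exactly as the formula above does, and does not settle how they should be treated.

```python
def combined_miss_rate(counts: dict) -> float:
    """(LLC-load-misses + LLC-store-misses) / (LLC-loads + LLC-stores)."""
    misses = counts["LLC-load-misses"] + counts["LLC-store-misses"]
    references = counts["LLC-loads"] + counts["LLC-stores"]
    if references == 0:
        raise ValueError("no LLC references were counted")
    return misses / references


# Counts copied from the sample output in Solution 3.
sample_counts = {
    "LLC-loads": 24_477_266_369,
    "LLC-load-misses": 1_409_470_007,
    "LLC-stores": 88_584_705,
    "LLC-store-misses": 10_545_277,
}
rate = combined_miss_rate(sample_counts)
print(f"combined LLC miss rate: {rate:.2%}")
```

With these counts the stores barely move the result, since loads dominate the references by more than two orders of magnitude.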