I am profiling performance on a Cortex-A78 running Linux with perf, and I got the following output, which confuses me.
Here is the data:
18,312,265 ll_cache_miss_rd # 0.507 M/sec (28.64%)
36,006,163 l3_cache_refill # 0.996 M/sec (42.68%)
The A78 TRM describes the event as follows:
LL_CACHE_MISS_RD: Last level cache miss, read.
• If CPUECTLR.EXTLLC is set: This event counts any cacheable read transaction which returns a data source of 'DRAM', 'remote' or 'inter-cluster peer'.
• If CPUECTLR.EXTLLC is not set: This event is a duplicate of the L*D_CACHE_REFILL_RD event corresponding to the last level of cache implemented – L3D_CACHE_REFILL_RD if both per-core L2 and cluster L3 are implemented, L2D_CACHE_REFILL_RD if only one is implemented, or L1D_CACHE_REFILL_RD if neither is implemented.
In my Cortex-A78 system, L3 is the last cache level and CPUECTLR.EXTLLC is 0, so according to the TRM ll_cache_miss_rd is a duplicate of L3D_CACHE_REFILL_RD.
So I expected these two counters to report similar numbers, but they do not: the refill count is roughly twice the miss_rd count!
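To put a number on the gap, here is a quick Python sketch of the arithmetic. The two counts are copied from the perf output above; the variable names are only my own labels, not anything perf or the TRM defines.

```python
# Counts copied verbatim from the perf stat output above.
ll_cache_miss_rd = int("18,312,265".replace(",", ""))
l3_cache_refill = int("36,006,163".replace(",", ""))

# Ratio of refills to read misses. A value near 1.0 would match my
# reading of the TRM; a value near 2.0 is what I actually observe.
ratio = l3_cache_refill / ll_cache_miss_rd
print(f"l3_cache_refill / ll_cache_miss_rd = {ratio:.2f}")
```

This comes out at about 1.97. As far as I understand perf's output, the percentages in parentheses (28.64% and 42.68%) are the share of the run during which each event was actually scheduled on a counter, so both totals are scaled estimates rather than exact counts.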
My questions are:
1. What does each of these events actually count on Cortex-A78?
2. Which event should I use to count reads from memory?
Thanks,
-Tao