I am profiling performance on a Cortex-A78 running Linux with perf, and I got the following output, which confuses me.
Here is the data:
18,312,265 ll_cache_miss_rd # 0.507 M/sec (28.64%)
36,006,163 l3_cache_refill # 0.996 M/sec (42.68%)
The A78 TRM describes the event as follows:
LL_CACHE_MISS_RD: Last level cache miss, read.
• If CPUECTLR.EXTLLC is set: This event counts any cacheable read transaction which returns a data source of 'DRAM', 'remote' or 'inter-cluster peer'.
• If CPUECTLR.EXTLLC is not set: This event is a duplicate of the L*D_CACHE_REFILL_RD event corresponding to the last level of cache implemented – L3D_CACHE_REFILL_RD if both per-core L2 and cluster L3 are implemented, L2D_CACHE_REFILL_RD if only one is implemented, or L1D_CACHE_REFILL_RD if neither is implemented.
In my Cortex-A78 system, L3 is the last-level cache and CPUECTLR.EXTLLC is 0, so according to the TRM ll_cache_miss_rd is a duplicate of L3D_CACHE_REFILL_RD.
So I expected these two counters to show similar values, but they do NOT: the refill count is roughly double the miss_rd count!
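For reference, here is a quick sketch of the arithmetic behind that observation, using only the two counts quoted above. The variable names are just for illustration. As far as I understand, the percentages in parentheses mean perf was multiplexing the counters, so both counts are scaled estimates rather than exact totals, and the ratio is approximate.

```python
# Counts copied from the perf output above. The percentages printed by perf
# (28.64% and 42.68%) are taken here to be the share of the run during which
# each event was actually counted, so both values are estimates.
ll_cache_miss_rd = 18_312_265
l3_cache_refill = 36_006_163

# How many times larger is the refill count than the miss_rd count?
ratio = l3_cache_refill / ll_cache_miss_rd
print(f"l3_cache_refill / ll_cache_miss_rd = {ratio:.2f}")  # prints 1.97
```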
My questions are:
1. What is the real meaning of each event on Cortex-A78?
2. To count memory reads, which event should be used?
Thanks,
-Tao