ARM Cortex-R52+ data cache miss count lower than expected

Hi,

I have a question about the Cortex-R52+, specifically about the behaviour of its data cache.

The setup is the following:

  • Cache size: 8KB
  • Cache type: 4-way set associative
  • Cache segregation:
    • 2 ways assigned to AXI-M
    • 2 ways assigned to FLASH
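
As a quick sanity check on the geometry above, here is a minimal sketch that derives the number of sets and the capacity available to each region. Only the 8KB size, the 4 ways and the 2+2 way split come from the setup; the 64-byte line size is an assumption and should be checked against the core's documentation, and the helper names are hypothetical.

```c
#include <assert.h>

/* Values taken from the setup above. */
#define CACHE_BYTES      (8u * 1024u)
#define CACHE_WAYS       4u
#define WAYS_PER_REGION  2u
/* ASSUMPTION: 64-byte cache line; not stated in the post. */
#define LINE_BYTES       64u

/* Number of sets in a set-associative cache: size / (ways * line size). */
static unsigned cache_sets(unsigned cache_bytes, unsigned ways, unsigned line_bytes)
{
    assert(ways != 0 && line_bytes != 0);
    return cache_bytes / (ways * line_bytes);
}

/* Capacity usable by one region when it owns only some of the ways. */
static unsigned region_bytes(unsigned sets, unsigned region_ways, unsigned line_bytes)
{
    return sets * region_ways * line_bytes;
}
```

Under the assumed line size, `cache_sets(CACHE_BYTES, CACHE_WAYS, LINE_BYTES)` gives 32 sets, and `region_bytes(32, WAYS_PER_REGION, LINE_BYTES)` gives 4096 bytes, so each region effectively sees a 4KB, 2-way cache.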

I want to track the number of cache misses generated, and to do so I am using the following event counters:

  • L1D_CACHE_REFILL counter (0x3): number of data cache refills
  • L1D_CACHE_ACCESS counter (0x4): number of data cache accesses
  • LD_RETIRED counter (0x6): number of loads executed
  • ST_RETIRED counter (0x7): number of stores executed

The test I am performing is quite simple: a matrix multiply that should cause ~260K data cache misses, because the sparse accesses to matrix B should always cause a refill. You may find the code below:

What I am observing, though, is a total of ~130K data cache misses, which does not add up with the current cache settings.
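
For reference, the ~260K expectation can be reproduced with a small host-side model of a 2-way LRU cache. This is only a sketch under several assumptions that are not in the post: a 64x64 matrix of 4-byte elements (guessed solely because 64^3 = 262,144 is close to 260K), a 64-byte line, 32 sets, B starting at a line-aligned address 0, and only the accesses to B being modelled. All names are hypothetical; this is not the test code itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SETS 32u  /* ASSUMPTION: 8KB / (4 ways * 64-byte line) */
#define WAYS 2u   /* ways available to the region holding B (from the setup) */
#define LINE 64u  /* ASSUMPTION: line size in bytes */

typedef struct {
    uint32_t tag[SETS][WAYS];
    uint8_t  valid[SETS][WAYS];
    uint8_t  lru[SETS];  /* index of the least recently used way; valid for 2 ways only */
} lru_cache;

/* Look up one address; returns 1 on a miss (refill), 0 on a hit. */
static int cache_access(lru_cache *c, uint32_t addr)
{
    uint32_t line = addr / LINE;
    uint32_t set  = line % SETS;
    uint32_t tag  = line / SETS;

    for (unsigned w = 0; w < WAYS; w++) {
        if (c->valid[set][w] && c->tag[set][w] == tag) {
            c->lru[set] = (uint8_t)(1u - w);  /* the other way is now LRU */
            return 0;
        }
    }
    /* Miss: replace the least recently used way. */
    unsigned victim = c->lru[set];
    c->tag[set][victim]   = tag;
    c->valid[set][victim] = 1;
    c->lru[set] = (uint8_t)(1u - victim);
    return 1;
}

/* Count misses on B[k][j] for a naive i-j-k multiply of n x n matrices. */
static unsigned long count_b_misses(unsigned n, unsigned elem_bytes)
{
    lru_cache c;
    unsigned long misses = 0;

    memset(&c, 0, sizeof c);  /* start with a cold cache */
    for (unsigned i = 0; i < n; i++)
        for (unsigned j = 0; j < n; j++)
            for (unsigned k = 0; k < n; k++)
                misses += (unsigned long)cache_access(&c, (k * n + j) * elem_bytes);
    return misses;
}
```

With these assumptions, `count_b_misses(64, 4)` returns 262144: each column walk touches 8 different lines per set, which a 2-way LRU set can never retain, so every access to B misses. A replacement policy that does not always evict the oldest line could keep some of those lines resident, which would lower the count.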

Here is the output from the test on hardware:

Hence my question: is the cache implementing an eviction algorithm other than LRU, one that causes far fewer misses than expected?

Do you have any insight into the matter, and into which eviction algorithm may be implemented in this case?

Thanks and best regards,

Alessandro
