Hi, I'm new to the community.
I have recently started working on cache performance evaluation of a piece of software on ARM (which I did not know much about before), and I am aiming to record all the instructions that cause a data cache miss.
Currently, my approach is straightforward: I configure the PMUIRQ as an FIQ and initially set the counter to -1. Every time an overflow FIQ occurs, the handler disables the cache, disables the counter, pushes lr to a stack, resets the counter to -1, enables the counter, enables the cache again, and returns to lr - 4.
But from the lr values I record, I found that many of the cache-miss instructions (which are at lr - 8) do not access data in memory (branch, cmp, etc.).
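To make the counter behaviour concrete, here is a minimal sketch in C that models the sequence described above in plain software. It does not touch real hardware: `pmu_model`, `sample_log`, `pmu_arm`, `pmu_event` and `overflow_handler` are hypothetical names invented for this illustration, and the cache disable/enable steps are left out because they have no software equivalent in this model. It only shows why presetting a 32-bit counter to -1 (0xFFFFFFFF) makes the very next counted event overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SAMPLES 64u

/* Hypothetical software model of one PMU event counter (no real hardware access). */
typedef struct {
    uint32_t count;     /* 32-bit event counter */
    int      enabled;   /* events are counted only while nonzero */
    int      overflow;  /* set when count wraps from 0xFFFFFFFF to 0 */
} pmu_model;

/* Hypothetical log of the lr values seen by the handler. */
typedef struct {
    uint32_t lr[MAX_SAMPLES];
    size_t   n;
} sample_log;

/* Preset the counter to -1 so that the next counted event overflows it. */
void pmu_arm(pmu_model *p)
{
    p->count = UINT32_MAX;
    p->overflow = 0;
    p->enabled = 1;
}

/* Model one counted event, e.g. one data cache miss. */
void pmu_event(pmu_model *p)
{
    if (!p->enabled)
        return;
    p->count++;                 /* unsigned wrap-around is well defined in C */
    if (p->count == 0)
        p->overflow = 1;        /* this is where the overflow interrupt would be raised */
}

/* Model of the handler: stop counting, record lr, re-arm, return the resume address. */
uint32_t overflow_handler(pmu_model *p, sample_log *log, uint32_t lr)
{
    assert(p->overflow);        /* the handler is only meaningful after an overflow */
    p->enabled = 0;             /* disable counter */
    if (log->n < MAX_SAMPLES)
        log->lr[log->n++] = lr; /* push lr */
    pmu_arm(p);                 /* reset to -1 and enable again */
    return lr - 4u;             /* address execution resumes at */
}
```

In this model the recorded lr is whatever value is passed in at the moment the handler runs. The model cannot show whether that moment coincides with the instruction that actually missed, which is exactly the open question here.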
I want to ask:
1. Which instructions can cause a cache miss?
2. Is it possible that the FIQ request is delayed?
3. Is there a better way to record the cache-miss instructions?
Thank you!
But this can only deal with instruction cache misses, I guess.
Why do you think the approach will not work for data cache misses? It should work for any performance counter.
Oh, I did not make myself clear. Basically, I want to rearrange our data structures to reduce data cache misses (and restructure the code to reduce instruction cache misses). So I think I need to know which data (or which data area) is accessed, rather than which function causes the data misses (admittedly, the function might narrow down the range).
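As an illustration of the "which data area" idea, here is a minimal sketch in C that attributes a list of data addresses to named memory regions and counts the hits per region. It assumes the address of each missing access can be obtained somehow, which nothing in this thread establishes. The `data_region` type, the functions `find_region` and `tally_misses`, and any region names or addresses used with them are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one data area, e.g. taken from a linker map. */
typedef struct {
    const char *name;
    uint32_t    start;   /* address of the first byte of the area */
    uint32_t    size;    /* length of the area in bytes */
    unsigned    misses;  /* number of sampled addresses that fell inside */
} data_region;

/* Return the index of the region containing addr, or -1 if no region does. */
int find_region(const data_region *r, size_t n, uint32_t addr)
{
    assert(r != NULL);
    for (size_t i = 0; i < n; i++) {
        /* For addr < start the unsigned subtraction wraps to a huge value,
           so one comparison checks both the lower and the upper bound. */
        if (addr - r[i].start < r[i].size)
            return (int)i;
    }
    return -1;
}

/* Attribute each sampled address to its region; return how many matched none. */
size_t tally_misses(data_region *r, size_t n, const uint32_t *addrs, size_t count)
{
    size_t unmatched = 0;
    for (size_t i = 0; i < count; i++) {
        int idx = find_region(r, n, addrs[i]);
        if (idx >= 0)
            r[idx].misses++;
        else
            unmatched++;
    }
    return unmatched;
}
```

Sorting the regions by their `misses` count afterwards would give a rough ranking of which data structures are worth rearranging first.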