I'm measuring the worst-case execution time of an application. I would like to flush the L1 and L2 caches (instruction and data) and then begin my measurements.
Is it doable from user mode?
Processor: ARM Cortex A9
OS: Linaro Linux
For the L1 caches, no. The ARMv7-A/R Architecture Reference Manual says:
B4.2.1 Cache and branch predictor maintenance operations, VMSA
This section describes the cache and branch predictor maintenance operations. These are:
• 32-bit write-only operations
• can be executed only by software executing at PL1 or higher.
There might be a system call that asks the OS to do it for you.
The Cortex-A9 does not have a built-in L2 cache, but it is often paired with an L2C-310. The controls of the L2C-310 are memory mapped, so strictly speaking it's a question of whether they are mapped to an address you have access to. In practice it is very unlikely that the L2 cache's controls would be mapped into user space.
Hello,
May I ask, for my own information:
Why do you need to manage caches from user mode?
I think user applications should not need to care about cache behavior.
Best regards,
Yasuhiko Koumoto.
Dear Yasuhiko Koumoto,
The differences in execution time between consecutive runs are huge (78ms, 52ms,...).
I believe flushing the caches before each run will make it more deterministic.
I could read and write a large block of data at random to evict the cache contents, but the OS may confine this operation to a region of the cache, so not all of the cached data would be cleared.
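For reference, the read/write approach could be sketched in C roughly as follows. This is only an approximation of a flush, not a real cache maintenance operation. The 4 MiB buffer size is an assumption (it should be several times larger than the L2 on the actual board), the 32-byte step matches the Cortex-A9 cache line length, and evict_data_caches is a hypothetical helper name. It only touches the data side, so the L1 instruction cache is not affected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 4 MiB is several times larger than the L2 on the target. */
#define EVICT_BYTES (4u * 1024u * 1024u)
/* Cortex-A9 cache lines are 32 bytes long. */
#define LINE_BYTES 32u

/*
 * Touch one byte per cache line across the buffer (read, modify, write)
 * so that previously cached data is likely to be evicted.
 * The returned sum only exists to keep the reads observable.
 */
unsigned long evict_data_caches(volatile uint8_t *buf, size_t len)
{
    unsigned long sum = 0;

    assert(buf != NULL);
    for (size_t i = 0; i < len; i += LINE_BYTES) {
        buf[i] = (uint8_t)(buf[i] + 1);
        sum += buf[i];
    }
    return sum;
}
```

The idea would be to allocate the buffer once with malloc(EVICT_BYTES) and call evict_data_caches(buf, EVICT_BYTES) before each measured run. Depending on the replacement policy, some old lines may still survive a single pass.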
Alternatively, taking the maximum over 'n' runs would make sense as an estimate of the worst-case execution time.
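A minimal sketch of the maximum-over-n-runs idea, assuming the code under test can be wrapped in a function with no arguments. The names max_runtime_ns and sample_work are hypothetical, and sample_work is just a placeholder workload. Timing uses the POSIX clock_gettime() with CLOCK_MONOTONIC.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for clock_gettime under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Current monotonic time in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Run work() 'runs' times and return the longest observed duration. */
uint64_t max_runtime_ns(void (*work)(void), unsigned runs)
{
    uint64_t worst = 0;

    assert(work != NULL);
    for (unsigned i = 0; i < runs; i++) {
        uint64_t start = now_ns();
        work();
        uint64_t elapsed = now_ns() - start;
        if (elapsed > worst)
            worst = elapsed;
    }
    return worst;
}

/* Placeholder workload; replace with the code being measured. */
static volatile unsigned sample_calls;
void sample_work(void)
{
    sample_calls++;
}
```

The observed maximum is only a lower bound on the true worst case, so more runs give a better estimate but never a guarantee.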
I wonder if there is a clean way to make the execution time more deterministic.
Update:
I found the GCC built-in function __builtin___clear_cache(), which flushes only the instruction cache, and only for a given address range.
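For what it's worth, the built-in takes a begin and an end address, so it works on a range rather than on the whole cache. It is meant for making freshly written code visible to instruction fetch (for example in a JIT), so it is probably not a general flush for timing purposes. A small sketch, where sync_icache_range is a hypothetical wrapper name:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Ask the compiler/runtime to synchronize the instruction cache for
 * the bytes in [start, start + len). On targets that do not need it,
 * the built-in may compile to nothing.
 */
void sync_icache_range(void *start, size_t len)
{
    char *begin = (char *)start;

    assert(start != NULL);
    __builtin___clear_cache(begin, begin + len);
}
```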