Deterministic code performance on an Arm Cortex-A9 core

Hi,

I am developing a simple low-level motor control application on the Arm Cortex-A9 core in the HPS of a Cyclone V SoC FPGA.
I am using the Arm Development Studio v2024.0-1 toolset to build the code.

Does the Cortex-A9 cache controller support loading a critical piece of code into one of the 4 ways of the internal 32 kB instruction cache and executing it from there, to achieve deterministic code performance?
Unfortunately, running the code from on-chip RAM violates our timing requirements because of the latency added by the system bus matrix and slow memory access.

I am aware that the Cortex-Mx core cache controller supports loading critical code into one of the cache ways and locking it there, so that fetching it always hits in the cache.
I couldn't find a similar feature described in the Arm Cortex-A9 Technical Reference Manual.

Best Regards,
Naresh

  

  • I'm not aware of a way to do this with the integrated L1 caches. If there's an L2 cache, it's external to the core, and its controller might offer a lockdown feature like this.

    Note that the data and instruction caches are not the only sources of non-deterministic timing. You may also need to consider the TLBs and branch prediction, depending on what level of deterministic timing you're after.

  • The Arm architecture manual mentions that the L1 cache doesn't support lockdown.
    Also, the TLB lockdown feature is there only for backward compatibility and is not fully functional in a multi-core cluster.

    I have learned that the Cortex-A9 is not designed for deterministic real-time performance.
    We have decided to make peace with variable latency and stall cycles until we switch to a real-time processor.
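
For anyone who wants to explore the L2 option raised in the first reply, here is a minimal sketch of how way lockdown is commonly done on an external L2 controller. It assumes an Arm L2C-310 (PL310) style controller with 8 ways, lockdown-by-way registers at offsets 0x900 (data) and 0x904 (instruction), and one bit per way where 1 blocks new allocations into that way. The function names and the `l2_base` parameter are hypothetical, and all of this should be checked against the controller's reference manual before use. Even if it works, it would only pin code in L2: L1 misses, the TLBs, and branch prediction can still add jitter.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed L2C-310 lockdown register offsets (first register pair).
   Verify against the L2 controller TRM for your device. */
#define L2C_D_LOCKDOWN0 0x900u  /* data lockdown by way        */
#define L2C_I_LOCKDOWN0 0x904u  /* instruction lockdown by way */

/* Mask that blocks allocation into every way except `way`.
   Written before pre-loading, so the critical code can only land in `way`. */
uint32_t l2_mask_fill_only(unsigned way, unsigned num_ways)
{
    assert(num_ways >= 1u && num_ways <= 16u && way < num_ways);
    uint32_t all_ways = (1u << num_ways) - 1u;
    return all_ways & ~(1u << way);
}

/* Mask that blocks allocation into `way` only.
   Written after pre-loading, so nothing else can evict the critical code. */
uint32_t l2_mask_lock_only(unsigned way, unsigned num_ways)
{
    assert(num_ways >= 1u && num_ways <= 16u && way < num_ways);
    return 1u << way;
}

/* Write the same mask to the data and instruction lockdown registers.
   `l2_base` is the controller's register base (hypothetical parameter). */
void l2_write_lockdown(volatile uint32_t *l2_base, uint32_t mask)
{
    l2_base[L2C_D_LOCKDOWN0 / 4u] = mask;
    l2_base[L2C_I_LOCKDOWN0 / 4u] = mask;
}
```

The intended sequence would be: write the fill-only mask, touch every cache line of the critical code so it is allocated into the chosen way, then write the lock-only mask. If the controller implements lockdown by master, there may be additional register pairs for the other masters that need the same treatment.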