Section 7.4.3 of the Cortex-A9 TRM, "Cortex-A9 behavior for Normal Memory Cacheable memory regions", says:
SCTLR.C=1 The Cortex-A9 Data Cache is enabled. Some Cacheable accesses are still treated
as Non-Cacheable:
• all pages marked as Write-Through are treated as Non-Cacheable
• if ACTLR.SMP=0, all pages marked as Shared are treated as Non-Cacheable.
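The two exceptions quoted above can be written down as a small predicate. This is only a sketch of the rule as quoted, not code from the TRM: the `page_attr` struct and the function name are made up for illustration, and it models nothing beyond SCTLR.C, ACTLR.SMP, Write-Through and the S bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Attributes of a Normal memory page as the page tables describe it.
 * Hypothetical struct, for illustration only. */
struct page_attr {
    bool cacheable;      /* marked Cacheable at all */
    bool write_through;  /* Write-Through rather than Write-Back */
    bool shared;         /* S bit set in the descriptor */
};

/* Returns true if the Cortex-A9 would actually cache accesses to the page,
 * following the two exceptions quoted from TRM section 7.4.3. */
static bool a9_effectively_cacheable(bool sctlr_c, bool actlr_smp,
                                     struct page_attr p)
{
    /* Write-Through only makes sense for a page that is Cacheable. */
    assert(p.cacheable || !p.write_through);

    if (!sctlr_c || !p.cacheable)
        return false;            /* data cache off, or page not Cacheable */
    if (p.write_through)
        return false;            /* Write-Through is treated as Non-Cacheable */
    if (!actlr_smp && p.shared)
        return false;            /* Shared needs ACTLR.SMP=1 to be cached */
    return true;
}
```

With SCTLR.C=1 and ACTLR.SMP=0, a Write-Back page with S=1 comes out as not cached, while the same page with S=0 comes out as cached, which is the case the question is about.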
I am only running on a single core, and I want to enable L1 and L2 caches without enabling the SCU (I have issues with that at the moment).
I tried setting ACTLR.SMP to 1 with the SCU disabled, without success so far (I get memory corruption). How should I set up my MMU to get the L1 and L2 caches on?
I think I need to map my pages with the S bit set to 0. Is that correct?
Hi vsiles,
Yep, marking memory as Non-Shareable implicitly means that you are the only core/device in the system that needs to see that data. As you progressively increase the Shareability, you're telling the system that it needs to ensure that data is pushed further and further out. ACTLR.SMP=0 implies there is no system cache coherency, so the only way to stay consistent with other devices Sharing data is to treat it as Non-Cacheable.
However I'd rather fix your problems getting the SCU working. Did you invalidate your SCU tags before enabling it? Even for a single core, you should really have the SCU enabled and ACTLR.SMP=1 and ACTLR.FW=1 (which the TRM helpfully forgets to tell you).
The SCU will deal with forwarding coherency requests to other cores or not depending on their presence/power control, and you won't receive or execute any if no other cores are up, but not setting those bits does have a significant effect on the MMU, TLBs and caches on most ARM Cortex-A cores and essentially makes the ACP useless on the Cortex-A9. You should be following the steps in:
ARM Cortex‑A9 MPCore Technical Reference Manual : 5.3.4 About multiprocessor bring-up
.. and set ACTLR.FW at the same time you set ACTLR.SMP.
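As a rough sketch of "set both bits in one go": on the Cortex-A9, ACTLR.FW is bit 0 and ACTLR.SMP is bit 6. The helper names below are hypothetical, and the actual register access (`mrc`/`mcr p15, 0, <Rt>, c1, c0, 1`) is left out so the bit manipulation can be checked on any host.

```c
#include <assert.h>
#include <stdint.h>

#define ACTLR_FW   (1u << 0)  /* broadcast of cache and TLB maintenance */
#define ACTLR_SMP  (1u << 6)  /* core takes part in coherency */

/* Value to write back to ACTLR: both bits set in a single update,
 * all other bits left as they were read. */
static uint32_t actlr_with_coherency(uint32_t actlr)
{
    return actlr | ACTLR_SMP | ACTLR_FW;
}

/* Debug check on the value read back after the write. The write may not
 * take effect in every configuration (for example from Non-secure state),
 * so it is worth confirming that both bits stuck. */
static void actlr_check_coherency(uint32_t actlr)
{
    assert((actlr & (ACTLR_SMP | ACTLR_FW)) == (ACTLR_SMP | ACTLR_FW));
}
```

On the target the flow would be: read ACTLR, pass it through `actlr_with_coherency()`, write it back, issue an `isb`, then read it again and check it.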
Ta,
Matt
Ok, so at least I understood this part correctly.
At the moment, here is what I do:
- At boot (only core0 is running), I make sure SCTLR.{M,C,I,Z} are set to 0
- I set a stack to call C code & clear the bss
- I invalidate the SCU (writing 0xffff to the inval register), I set the SCU diag register bit 0 (SCU_BASE + 0x30 as per the errata) to 1 and then I enable the SCU
- I set the ACTLR.{SMP,FW} bits
- I invalidate the whole TLB, L1 I cache, BP cache & L1 D cache
- I configure the PL310 by invalidating it completely and enabling it
- I set the undocumented diag register for the i.MX 6 errata (bits 4, 6, 11 & 21)
- I set TTBR0,1,TTBCR and then I set the SCTLR.{M, C, I, Z} bits
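For reference, the register values behind some of these steps can be sketched as follows. The SCU offsets 0x00 (control) and 0x0C (invalidate all, Secure state) are from the Cortex-A9 MPCore TRM, and 0x30 is the diagnostic offset quoted in the list; the function names are invented for this example, the SCU base is passed in as a pointer so the code can run against a fake register array, and the barriers needed on real hardware are omitted.

```c
#include <assert.h>
#include <stdint.h>

/* SCU register offsets, in bytes from the SCU base address. */
#define SCU_CTRL        0x00u  /* bit 0: SCU enable */
#define SCU_INVALIDATE  0x0Cu  /* Invalidate All Registers in Secure State */
#define SCU_DIAG        0x30u  /* diagnostic control, offset as quoted above */

#define SCTLR_M (1u << 0)   /* MMU enable */
#define SCTLR_C (1u << 2)   /* data cache enable */
#define SCTLR_Z (1u << 11)  /* branch prediction enable */
#define SCTLR_I (1u << 12)  /* instruction cache enable */

static void reg_write(volatile uint32_t *base, uint32_t off, uint32_t val)
{
    base[off / 4u] = val;
}

static uint32_t reg_read(volatile uint32_t *base, uint32_t off)
{
    return base[off / 4u];
}

/* Invalidate the SCU tag RAMs, set diagnostic bit 0, then enable the SCU. */
static void scu_invalidate_and_enable(volatile uint32_t *scu)
{
    reg_write(scu, SCU_INVALIDATE, 0xFFFFu);  /* all ways, all four CPUs */
    reg_write(scu, SCU_DIAG, reg_read(scu, SCU_DIAG) | 1u);
    reg_write(scu, SCU_CTRL, reg_read(scu, SCU_CTRL) | 1u);
    assert(reg_read(scu, SCU_CTRL) & 1u);     /* the enable bit must stick */
}

/* Bits 4, 6, 11 and 21 of the CP15 diagnostic register, as listed above. */
static uint32_t diag_errata_bits(void)
{
    return (1u << 4) | (1u << 6) | (1u << 11) | (1u << 21);
}

/* SCTLR value for the last step: M, C, I and Z set together. */
static uint32_t sctlr_enable_mmu_and_caches(uint32_t sctlr)
{
    return sctlr | SCTLR_M | SCTLR_C | SCTLR_I | SCTLR_Z;
}
```

On the target, `scu` would be the SCU base (PERIPHBASE) mapped or accessed as Device memory. The PL310 and TLB/L1 invalidation steps are not covered by this sketch.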
With this sequence I was getting quite a lot of crashes. I added a dsb before the DCISW in my L1 D cache invalidation routine (as per erratum 764369) and now I get far fewer crashes. I manage to boot quite often, but some crashes remain.
I'm investigating my L1 D cache invalidation routine to check that I do everything correctly, since that might be the problem.
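One thing worth checking in such a routine is the operand passed to DCISW. In the ARMv7 set/way format the way index sits in the top bits, the set index starts at bit log2(line size), and the cache level goes in bits [3:1]. The sketch below computes that operand; the struct and function names are hypothetical, and the geometry in the usage note (32 KB, 4-way, 32-byte lines) is an assumption that should be replaced by the values read from CCSIDR on the actual part.

```c
#include <assert.h>
#include <stdint.h>

/* Geometry of one cache level. Hypothetical struct for this example. */
struct cache_geom {
    uint32_t line_bytes;  /* line length in bytes, a power of two */
    uint32_t ways;        /* associativity */
    uint32_t sets;        /* number of sets */
};

/* Smallest n such that 2^n >= x. */
static uint32_t ceil_log2(uint32_t x)
{
    uint32_t n = 0;
    while ((1u << n) < x)
        n++;
    return n;
}

/* Operand for a set/way operation such as DCISW.
 * level is 0-based: 0 selects the L1 cache. */
static uint32_t setway_operand(struct cache_geom g, uint32_t level,
                               uint32_t way, uint32_t set)
{
    uint32_t a = ceil_log2(g.ways);        /* width of the way field */
    uint32_t l = ceil_log2(g.line_bytes);  /* position of the set field */

    assert(way < g.ways && set < g.sets && level < 7u);

    /* A direct-mapped cache has no way field; avoid shifting by 32. */
    uint32_t way_field = (a == 0u) ? 0u : (way << (32u - a));
    return way_field | (set << l) | (level << 1);
}
```

For a 32 KB, 4-way cache with 32-byte lines (256 sets), way 3 / set 255 at L1 gives 0xC0001FE0. A full invalidate then loops over every way and every set, issuing DCISW with each operand, with a `dsb` once the loop is done.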
Thank you for the explanations!
mwsealey, I noticed that I don't have any issues running code in svc mode, but I get a lot of errors that seem L1 cache related in my user mode code.
To test this, instead of switching back to a user application at the exit of my kernel, I jumped back to the usual "svc" entry point, where I fill and read back a 2 MB array with random data and then loop forever.
This code runs perfectly.
Do you have any ideas or suggestions about what I should focus on to explain why I get cache errors in user mode but not in kernel mode?
My kernel is mapped as Section, Normal Memory WB WA, Shared, Global
My apps are mapped as 4k Page Tables, Normal Memory, WB WA Shared, Non Global
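To make the two mappings concrete, here is a sketch of how such descriptors are encoded in the ARMv7 short-descriptor format. It assumes TEX remap is disabled (SCTLR.TRE=0), so Write-Back Write-Allocate is TEX=0b001, C=1, B=1. It also assumes domain 0, privileged-only access (AP=0b01) for the kernel section and full access (AP=0b11) for the user page. The macro and function names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* First-level section descriptor fields. */
#define SEC_TYPE    (2u << 0)
#define SEC_B       (1u << 2)
#define SEC_C       (1u << 3)
#define SEC_AP(ap)  (((ap) & 3u) << 10)
#define SEC_TEX(t)  (((t) & 7u) << 12)
#define SEC_S       (1u << 16)
#define SEC_NG      (1u << 17)

/* Second-level small page (4 KB) descriptor fields. */
#define PG_TYPE     (1u << 1)
#define PG_B        (1u << 2)
#define PG_C        (1u << 3)
#define PG_AP(ap)   (((ap) & 3u) << 4)
#define PG_TEX(t)   (((t) & 7u) << 6)
#define PG_S        (1u << 10)
#define PG_NG       (1u << 11)

/* 1 MB kernel section: Normal, WB WA, privileged read/write, domain 0. */
static uint32_t kernel_section(uint32_t pa, bool shared, bool global)
{
    assert((pa & 0x000FFFFFu) == 0u);  /* sections are 1 MB aligned */
    uint32_t d = pa | SEC_TYPE | SEC_TEX(1u) | SEC_C | SEC_B | SEC_AP(1u);
    if (shared)
        d |= SEC_S;
    if (!global)
        d |= SEC_NG;
    return d;
}

/* 4 KB user page: Normal, WB WA, read/write at any privilege level. */
static uint32_t user_small_page(uint32_t pa, bool shared, bool global)
{
    assert((pa & 0x00000FFFu) == 0u);  /* small pages are 4 KB aligned */
    uint32_t d = pa | PG_TYPE | PG_TEX(1u) | PG_C | PG_B | PG_AP(3u);
    if (shared)
        d |= PG_S;
    if (!global)
        d |= PG_NG;
    return d;
}
```

The only difference between the Shared and Non-Shared variants is the S bit (bit 16 for a section, bit 10 for a small page), and nG is what ties the user pages to the current ASID.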
Best regards,
Vincent
I replaced my inner-shareable TLB invalidate by ASID with a plain TLBIALLIS and it works... I'll try to understand what's going wrong with my ASID management.
I'm not sure it really matters, on Cortex-A9 at least, since the caches are disabled, but is there any reason you don't invalidate the CPU caches as soon as possible, instead of leaving it until after you enable coherency management? On more modern cores, cache hits can still happen even if the caches are off, although most of those cores automatically invalidate their caches on 'cold' reset.
For the boot-time invalidation you shouldn't be using any IS-suffixed cache maintenance operations. Apart from that I'm not sure what could be going on; it's hard to guess at the complexity of the issue when all we have is "it crashes and I see memory corruption".
You mention that you replaced your TLB invalidate by ASID with TLBIALLIS. There are few reasons to invalidate the entire TLB, or to evict an entire application's worth of translations, at most points in kernel code. Application exit is one, although you can usually rely on the entries being naturally evicted. Another is ASID re-use. Is it possible that you're not following the sequence defined for break-before-make, or not properly synchronizing your TTBR/CONTEXTIDR swap, depending on which version you're using? (There are Cortex-A9 errata on this with the exact sequences.)
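As an illustration of what "properly synchronizing" the swap can look like, the sketch below lays out one of the orderings described in the ARMv7-A architecture manual, the reserved-ASID approach: switch to an ASID that no task uses, change TTBR0, then switch to the new ASID, with an ISB between the steps. It only builds the ordered list of operations so the order can be checked on a host. The type and function names are hypothetical, PROCID is assumed to be 0, and any extra barriers required by Cortex-A9 errata are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

enum step_kind {
    STEP_CONTEXTIDR,  /* mcr p15, 0, <Rt>, c13, c0, 1 */
    STEP_TTBR0,       /* mcr p15, 0, <Rt>, c2, c0, 0 */
    STEP_ISB          /* isb */
};

struct step {
    enum step_kind kind;
    uint32_t value;   /* register value; unused for STEP_ISB */
};

/* Fills out[] with the ordered operations for switching address space
 * and returns the number of steps. The ASID occupies CONTEXTIDR[7:0]. */
static int plan_address_space_switch(uint32_t new_ttbr0, uint8_t new_asid,
                                     uint8_t reserved_asid,
                                     struct step out[5])
{
    /* The reserved ASID must never be handed out to a task. */
    assert(new_asid != reserved_asid);

    out[0] = (struct step){ STEP_CONTEXTIDR, reserved_asid };
    out[1] = (struct step){ STEP_ISB, 0u };
    out[2] = (struct step){ STEP_TTBR0, new_ttbr0 };
    out[3] = (struct step){ STEP_ISB, 0u };
    out[4] = (struct step){ STEP_CONTEXTIDR, new_asid };
    return 5;
}
```

The point of the ordering is that the old ASID is never live with the new tables, and the new ASID is never live with the old tables, so no stale non-global entry can be tagged with the wrong ASID. On the target each step maps to the instruction shown in the enum comments.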