I ran into a problem when replacing the active TTBR0_EL2 register. It seems that either subsequent data loads do not use the new page table, or instruction execution goes wrong. In short, a Data Abort is triggered, and the ISS indicates a translation fault at level 1.
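For readers who want to check the same syndrome, here is a minimal sketch of decoding a level-N translation fault from an ESR_ELx value. The helper name `dabt_translation_fault_level` is made up for illustration. It assumes the standard field layout (EC in bits [31:26], DFSC in bits [5:0]) and ignores the level -1 encoding added by FEAT_LPA2.

```c
#include <stdint.h>

/* Field positions as laid out in ESR_ELx. */
#define ESR_EC(esr)    (((esr) >> 26) & 0x3Fu)  /* exception class */
#define ESR_DFSC(esr)  ((esr) & 0x3Fu)          /* data fault status code */

#define EC_DABT_LOWER_EL  0x24u  /* Data Abort taken from a lower EL */
#define EC_DABT_SAME_EL   0x25u  /* Data Abort taken without a change in EL */

/*
 * Hypothetical helper: returns the translation fault level (0..3) for a
 * Data Abort syndrome, or -1 if the syndrome is not a Data Abort caused
 * by a translation fault.
 */
static int dabt_translation_fault_level(uint64_t esr)
{
    unsigned ec = (unsigned)ESR_EC(esr);
    unsigned dfsc = (unsigned)ESR_DFSC(esr);

    if (ec != EC_DABT_LOWER_EL && ec != EC_DABT_SAME_EL)
        return -1;
    /* Translation faults are encoded as 0b0001LL, LL being the level. */
    if ((dfsc & 0x3Cu) != 0x04u)
        return -1;
    return (int)(dfsc & 0x3u);
}
```

At EL2 the value would come from ESR_EL2; a syndrome such as `0x96000005` (EC `0x25`, DFSC `0b000101`) decodes to level 1.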
I compared this with how the Linux kernel changes the page table base address and found that Linux places an `isb` instruction between the write to the TTBR and the TLB invalidation. I added it and everything worked fine. I have read the Arm manual and some online material, but I couldn't figure out why.
Configuration: MMU on, dcache off, icache off
```asm
// linux/arch/arm64/mm/proc.S
	.macro	__idmap_cpu_set_reserved_ttbr1, tmp1, tmp2
	adrp	\tmp1, reserved_pg_dir
	phys_to_ttbr \tmp2, \tmp1
	offset_ttbr1 \tmp2, \tmp1
	msr	ttbr1_el1, \tmp2
	isb
	tlbi	vmalle1
	dsb	nsh
	isb
	.endm
```
frank said: why is a memory barrier needed?
I think this is the section of the Arm ARM you need:
B2.10.1 Instruction Synchronization Barrier (ISB)

An ISB instruction ensures that all instructions that come after the ISB instruction in program order are fetched from the cache or memory after the ISB instruction has completed. Using an ISB ensures that the effects of context-changing operations executed before the ISB are visible to the instructions fetched after the ISB instruction. Examples of context-changing operations that require the insertion of an ISB instruction to ensure the effects of the operation are visible to instructions fetched after the ISB instruction are:

- Completed cache and TLB maintenance instructions.
- Changes to System registers.
A change to a TTBR is a change to the PE's context. A change to context is only guaranteed to be visible after a context synchronization event, which is either an ISB or exception entry/exit (note: see FEAT_ExS). Meaning in your case, without the ISB the PE might still be using the "old" context at the time of TLBI. Leading it to attempt to re-walk the tables using the old configuration.
frank said: if yes, why can't `dsb` be used instead?
Because ISBs and DSBs do different things. Sometimes you need one, sometimes the other, sometimes a combination. It all depends on what you are trying to achieve. If you had one "super" barrier that did everything, it would end up being inefficient, as often you'd be over-ordering.
    msr ttbr1_el1, \tmp2
    isb             <-- ensure change in context (write to TTBR) is visible to following instructions
    tlbi vmalle1
    dsb nsh         <-- ensure TLBI completes before processing next instruction
    isb             <-- ensure instructions beyond this point are fetched using the post-TLBI translations
Thanks for the reply.
"without the ISB the PE might still be using the "old" context at the time of TLBI. Leading it to attempt to re-walk the tables using the old configuration."
Assuming a single-core environment, only the few instructions inside that block could use the old page table. The final `isb` must cause context synchronization, so instructions after the block should behave normally, right? I also have the final `isb`, but the ELR for the Data Abort shows that the exception occurred after this block, on an `ldr` instruction.
That doesn't quite work.
Let's work through the example with and without the first ISB. First, without:
    1 msr ttbr1_el1, \tmp2
    2 nop
    3 tlbi vmalle1
    4 dsb nsh
    5 isb
    6 <some instruction>

The effect of writing TTBR1 is not guaranteed to be visible until the ISB at line (5). The TLBI at line (3) invalidates cached translations. As soon as the TLBI completes, the processor is allowed to speculatively walk the translation tables and cache entries in the TLBs.
Meaning we have a problem. There is a window between lines 3 and 5 where the processor might do speculative table walks and might be using the old TTBR1 value. We don't know that it will, but it might - so we should assume it will.
The ISB at line 5 means the subsequent instructions see the effect of the TTBR update. But that doesn't do anything about the speculative table walks that have already happened, or the already cached TLB entries. Meaning that instructions at line 6 onwards could be using stale translations.
Now let's put the ISB at line 2 back in:

    1 msr ttbr1_el1, \tmp2
    2 isb
    3 tlbi vmalle1
    4 dsb nsh
    5 isb
    6 <some instruction>
With an ISB at line 2, the update to TTBR1 at line (1) must be visible before the TLBI at line (3). The MMU might still speculatively refill the TLBs between lines 3 and 5, but it must do so using the new TTBR1 value. Therefore, from line 6 we can guarantee that we only see the new translations.
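The two walkthroughs above can be replayed with a toy model in C. This is only a sketch of the reasoning, not of real hardware. It assumes the worst case allowed by the architecture: the TTBR write becomes visible to the table walker only at an ISB, and the walker speculatively refills the TLB right after the TLBI. The struct, the function names and the single-entry TLB are all invented for illustration.

```c
#include <stdbool.h>

enum { OLD_TABLE = 1, NEW_TABLE = 2 };

/* Toy PE state: a single TLB entry and a TTBR with delayed visibility. */
struct pe {
    int  ttbr_written;  /* value last written with MSR */
    int  ttbr_visible;  /* value the table walker actually uses */
    bool tlb_valid;
    int  tlb_source;    /* table the cached translation was walked from */
};

static void msr_ttbr(struct pe *p, int v) { p->ttbr_written = v; }

/* Context synchronization: the pending TTBR write becomes visible. */
static void isb(struct pe *p) { p->ttbr_visible = p->ttbr_written; }

static void tlbi(struct pe *p) { p->tlb_valid = false; }

/* The walker refills an empty TLB using whatever TTBR it currently sees. */
static void speculative_walk(struct pe *p)
{
    if (!p->tlb_valid) {
        p->tlb_valid = true;
        p->tlb_source = p->ttbr_visible;
    }
}

/* Returns which table the instruction at line 6 ends up translating with. */
static int run_sequence(bool isb_after_msr)
{
    struct pe p = { OLD_TABLE, OLD_TABLE, true, OLD_TABLE };

    msr_ttbr(&p, NEW_TABLE);   /* line 1 */
    if (isb_after_msr)
        isb(&p);               /* line 2: isb (otherwise a nop) */
    tlbi(&p);                  /* line 3 */
    speculative_walk(&p);      /* worst case: a walk in the window 3..5 */
                               /* line 4: dsb nsh, no extra state here */
    isb(&p);                   /* line 5 */
    speculative_walk(&p);      /* line 6: hits the TLB if it is valid */
    return p.tlb_source;
}
```

In this model `run_sequence(false)` returns `OLD_TABLE` (a stale entry survives the final ISB), while `run_sequence(true)` returns `NEW_TABLE`.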
This solves my confusion. Thanks again.
BTW, do you think break-before-make is necessary in my case? (I'm currently running at EL2 from low addresses, and I'm trying to switch TTBR0_EL2 to a completely new page table.)
It depends on what you're trying to achieve.
The advantage of break-before-make (BBM) is predictability. TLBs are not permitted to cache invalid translations (ones that result in a Translation Fault). BBM therefore lets you ensure that the old and new translations can't be in the TLB at the same time (which would be a bad thing).
But you don't always need that. For example, if you're doing a task switch, the different tasks would typically be ASID (or ASID+VMID) tagged. It's perfectly valid to have the same translation in the TLBs if they're tagged with different ASIDs. That assumes that you can atomically switch the TTBR and ASID, which in AArch64 you can.
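The atomic switch is possible because the ASID lives in the same register as the table base address, so one MSR updates both. A minimal sketch of composing such a value, assuming the TTBR0_EL1/TTBR1_EL1 layout with the ASID in bits [63:48] and the base address in bits [47:1] (48-bit physical addresses, no FEAT_LPA). `make_ttbr` is a hypothetical helper, not a kernel or library function.

```c
#include <stdint.h>

#define TTBR_ASID_SHIFT  48
#define TTBR_BADDR_MASK  0x0000FFFFFFFFFFFEull  /* bits [47:1] */

/*
 * Hypothetical helper: pack a translation table base address and an
 * ASID into a single TTBR value. Bit 0 (CnP) is left clear.
 */
static uint64_t make_ttbr(uint64_t table_pa, uint16_t asid)
{
    return ((uint64_t)asid << TTBR_ASID_SHIFT) | (table_pa & TTBR_BADDR_MASK);
}
```

Which TTBR supplies the current ASID is selected by TCR_EL1.A1. Note that TTBR0_EL2 only has an ASID field when HCR_EL2.E2H is 1, so this does not carry over directly to a plain EL2 translation regime.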