TTBR0_EL1, translation fault level 0 after switching identity mapping off

Hello everyone.

I'm creating a single-core kernel for the RPi 4B (ARMv8, BCM2711). I've successfully mapped the kernel in high memory using TTBR1. TTBR0 stays on the identity mapping until the user process is loaded. For the last week I've been fighting the following issue: after switching the TTBR0 table from the identity map to the user process's table, the memory becomes inaccessible, and a level 0 translation fault is triggered on access.

Any clues as to what is happening are greatly appreciated; I've run out of ideas for what to check.

The old mapping works up until the switch, so that leads me to believe that the MMU is set up correctly. After the switch, the memory addresses that are mapped there are inaccessible, even when using GDB via JTAG.

My intuition tells me that I'm missing something about TLB maintenance, but I don't know what. The code that I'm using to invalidate the TLB is the following:

        dsb ish
        isb sy
        
        msr ttbr0_el1, x20
        
        ic iallu
        dsb sy
        isb sy
        tlbi vmalle1
        dmb sy
        isb sy


The TTBR addresses are:

       TTBR0:  0x100010000 
       TTBR1:  0xa5000   




The table descriptors for the user process, as seen from GDB, are as follows:

(gdb) x/2gx 0xffff000100010000
0xffff000100010000:     0x0000000100012003      0x0000000000000000
(gdb) x/16gx 0xffff000100012000
0xffff000100012000:     0x0000000100013403      0x0000000000000000
0xffff000100012010:     0x0000000000000000      0x0000000000000000
0xffff000100012020:     0x0000000000000000      0x0000000000000000
0xffff000100012030:     0x0000000000000000      0x0000000000000000
0xffff000100012040:     0x0000000000000000      0x0000000000000000
0xffff000100012050:     0x0000000000000000      0x0000000000000000
0xffff000100012060:     0x0000000100016403      0x0000000000000000
0xffff000100012070:     0x0000000000000000      0x0000000000000000
(gdb) x/2gx 0xffff000100013000
0xffff000100013000:     0x0000000100014403      0x0000000000000000
(gdb) x/2gx 0xffff000100014000
0xffff000100014000:     0x00200001000114c1      0x0000000000000000
(gdb) x/2gx 0xffff000100016000
0xffff000100016000:     0x0000000100017403      0x0000000000000000
(gdb) x/2gx 0xffff000100017000
0xffff000100017000:     0x0060000100015441      0x0000000000000000
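
Reading raw descriptors by eye is error-prone, so here is a minimal host-side sketch that classifies each value from the dump above. It assumes a 4 KiB granule, 48-bit output addresses, and a walk that starts at level 0 (consistent with the level 0 fault reported). The names `decode_descriptor`, `ADDR_MASK` and `LEAF_SHIFT` are made up for this illustration and are not part of the original post.

```python
# Host-side debugging aid (hypothetical helper names), assuming a 4 KiB
# granule and 48-bit output addresses.
ADDR_MASK = 0x0000_FFFF_FFFF_F000   # descriptor bits [47:12]
LEAF_SHIFT = {1: 30, 2: 21, 3: 12}  # log2 of the region a leaf maps per level


def decode_descriptor(desc: int, level: int) -> dict:
    """Classify one stage 1 descriptor found at lookup level 0..3."""
    if desc & 1 == 0:
        return {"kind": "invalid"}
    if desc & 2:
        # Bits [1:0] == 0b11: table at levels 0-2, page at level 3.
        kind = "page" if level == 3 else "table"
    else:
        # Bits [1:0] == 0b01: block at levels 1 and 2, reserved elsewhere.
        kind = "block" if level in (1, 2) else "reserved"
    info = {"kind": kind}
    if kind == "table":
        info["next_table"] = desc & ADDR_MASK
    elif kind in ("page", "block"):
        size_mask = (1 << LEAF_SHIFT[level]) - 1
        info["output_address"] = desc & ADDR_MASK & ~size_mask
        info["attr_index"] = (desc >> 2) & 0b111
        info["ap"] = (desc >> 6) & 0b11
        info["sh"] = (desc >> 8) & 0b11
        info["af"] = (desc >> 10) & 1
        info["pxn"] = (desc >> 53) & 1
        info["uxn"] = (desc >> 54) & 1
    return info


if __name__ == "__main__":
    # (level, descriptor) pairs copied from the GDB dump, assuming the
    # table at 0x100010000 is the level 0 table.
    dump = [
        (0, 0x0000000100012003),
        (1, 0x0000000100013403),
        (2, 0x0000000100014403),
        (3, 0x00200001000114C1),
        (3, 0x0060000100015441),
    ]
    for level, desc in dump:
        print(level, hex(desc), decode_descriptor(desc, level))
```

Under those assumptions the two leaf entries sit at level 3 with bits [1:0] equal to 0b01, which is a reserved encoding there (a page descriptor needs 0b11), while the same value at level 2 would decode as a block. Treat that as something to check against the real table layout rather than a confirmed diagnosis.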

  • Is the fault detected while returning to EL0? i.e., when the CPU is about to return to EL0 following an exception return. The code that switches TTBR0 must be running at EL1 under TTBR1, so it should remain unaffected by the switch.

    Also, why is bit#10 (0x400) set in the table descriptors?

  • The fault occurs at eret to EL0, but even if I don't return to EL0, the EL1 code cannot read the addresses mapped for that program after a "tlbi vmalle1". The PAR_EL1 bits I get when trying to translate are 0001001 (indicating the same issue as ESR_EL1.ISS for the abort).


    Bit #10 was my mistake; I thought the access flag was required for table descriptors. I've since removed it, but the problem persists.
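
For reference, the PAR_EL1 value quoted above can be decoded mechanically. The sketch below assumes that "0001001" is PAR_EL1[6:0] written in binary, with F in bit 0 and FST in bits [6:1]. The function name `decode_par_fault` is hypothetical, and only the four most common fault classes are named.

```python
# Hypothetical helper: decode the low bits of PAR_EL1 after an AT instruction.
FAULT_CLASSES = {
    0b0000: "address size fault",
    0b0001: "translation fault",
    0b0010: "access flag fault",
    0b0011: "permission fault",
}


def decode_par_fault(par: int) -> str:
    """Return a short description of PAR_EL1.F and PAR_EL1.FST."""
    if par & 1 == 0:
        return "translation succeeded"
    fst = (par >> 1) & 0x3F           # fault status code, PAR_EL1[6:1]
    fault = FAULT_CLASSES.get(fst >> 2)
    if fault is None:
        return f"other fault, FST={fst:#08b}"
    return f"{fault}, level {fst & 0b11}"


if __name__ == "__main__":
    print(decode_par_fault(0b0001001))
```

With that reading, F is 1 and FST is 0b000100, which matches the level 0 translation fault reported in ESR_EL1.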

  • Okay.

    Since you already have a working TTBR0 mapping (the identity map), I would start by first creating a new copy of the identity mapping table and switching to it and verifying that the identity mapping still works.

    Later, as a test, create another TTBR0 mapping, this time for a single page, maybe in the VA range [0, 0x1000), switch to it, and verify that translation in that VA range still works. I would not set PXN, XN, or other bits that explicitly enforce restrictions, in order to minimize the debugging effort.

    Or, we can also take the identity map, map a new TTBR0 VA range for a page, and see if the new range is accessible.

  • I've managed to move forward, though I'm not sure if it's in the right direction. It turns out I had not set IRGN, ORGN, and SH for either TTBR0 or TTBR1 in the TCR register. After setting them to inner shareable and W-B, R-A, W-A for both tables, I'm getting a translation fault at level 3.

    So now it's probably due to some flag setting in PTE descriptor that I'm missing? Could wrong MAIR bits have the same effect?
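
As a rough illustration of the TCR_EL1 change described here, the sketch below builds the table-walk attribute bits for both TTBR0 and TTBR1: IRGN0 [9:8], ORGN0 [11:10], SH0 [13:12], IRGN1 [25:24], ORGN1 [27:26], SH1 [29:28]. The helper names are invented for the example, and the other TCR_EL1 fields (T0SZ, T1SZ, TG0, TG1, IPS) are deliberately left out.

```python
# Hypothetical helpers for the TCR_EL1 fields that control how the table
# walker accesses the translation tables.
NORMAL_WB_RA_WA = 0b01   # Write-Back, Read-Allocate, Write-Allocate cacheable
INNER_SHAREABLE = 0b11

TCR_FIELDS = {
    "IRGN0": 8, "ORGN0": 10, "SH0": 12,
    "IRGN1": 24, "ORGN1": 26, "SH1": 28,
}


def tcr_walk_attributes(irgn: int = NORMAL_WB_RA_WA,
                        orgn: int = NORMAL_WB_RA_WA,
                        sh: int = INNER_SHAREABLE) -> int:
    """Return the bits to OR into TCR_EL1 for both translation regimes."""
    value = 0
    for name, shift in TCR_FIELDS.items():
        if name.startswith("IRGN"):
            field = irgn
        elif name.startswith("ORGN"):
            field = orgn
        else:
            field = sh
        value |= (field & 0b11) << shift
    return value


def extract_walk_attributes(tcr: int) -> dict:
    """Pull the same six two-bit fields back out of a TCR_EL1 value."""
    return {name: (tcr >> shift) & 0b11 for name, shift in TCR_FIELDS.items()}


if __name__ == "__main__":
    print(hex(tcr_walk_attributes()))
```

One plausible reason these fields matter: if they are left at zero, table walks are non-cacheable and non-shareable, so tables written through a cacheable mapping and not cleaned to memory may look stale to the walker.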

  • I was about to write about the shareability, since the entries set the field to non-shareable.

    "So now it's probably due to some flag setting in PTE descriptor that I'm missing? Could wrong MAIR bits have the same effect?"

    Yes, I think so. If we haven't set TCR to appropriate values, we may also have missed other basic steps in the setup. We need at least MAIR, TCR, TTBRx and SCTLR configured for the MMU to work properly.

    Edit: Can your setup run on QEMU? It does not provide rpi4b emulation yet, but it does have arm64 Virt and rpi3b emulation support.
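
To check the MAIR side of that list, the small sketch below shows which MAIR_EL1 byte a descriptor's AttrIndx field selects. The MAIR value in the example is hypothetical, since the thread never shows the poster's actual register contents, and only a few well-known attribute encodings are named.

```python
# Hypothetical helpers: look up the attribute byte selected by AttrIndx.
KNOWN_ATTRS = {
    0x00: "Device-nGnRnE",
    0x04: "Device-nGnRE",
    0x44: "Normal, Inner/Outer Non-cacheable",
    0xFF: "Normal, Inner/Outer Write-Back Non-transient, R-A, W-A",
}


def mair_attr(mair: int, attr_index: int) -> int:
    """Return MAIR_EL1.Attr<n>, i.e. bits [8n+7:8n], for n = attr_index."""
    if not 0 <= attr_index <= 7:
        raise ValueError("AttrIndx is a 3-bit field")
    return (mair >> (8 * attr_index)) & 0xFF


def describe_attr(attr: int) -> str:
    """Name the encoding if it is one of the few listed above."""
    return KNOWN_ATTRS.get(attr, f"unlisted encoding {attr:#04x}")


if __name__ == "__main__":
    example_mair = 0x4404FF  # assumed value: Attr0=0xFF, Attr1=0x04, Attr2=0x44
    for index in range(3):
        print(index, describe_attr(mair_attr(example_mair, index)))
```

Both leaf descriptors in the dump have AttrIndx equal to 0, so whatever MAIR_EL1.Attr0 holds determines their memory type.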

  • The reads in low memory work if I reassign the identity map to TTBR0. I've switched all of the flags in the new mapping to match those in the identity map, and it still aborts after switching to it.

    The identity mapping is done via 2MB blocks. If I modify the TTBR0 to map via 2MB blocks too, the abort goes away. Any clues as to why this is?

    ————
    The only way I got it running on QEMU was by compiling the rpi4 branch with WIP support, but so far it has proven inconclusive, because a lot of things just do not work properly.

    ————

    Edit: in the interest of keeping problems separate, I'm going to move this to another thread. This one has been solved by setting proper shareability and IRGN/ORGN in TCR_EL1. See community.arm.com/.../ttbr0_el1-translation-fault-level-3-on-4kib-blocks---2mib-works
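
To make the 2 MiB versus 4 KiB difference concrete, the sketch below splits a virtual address into per-level table indices. It assumes a 4 KiB granule and a 48-bit VA space, and the function names are hypothetical. A 2 MiB block mapping is resolved by the level 2 entry, while a 4 KiB page needs one more lookup at level 3.

```python
# Hypothetical helpers for a 4 KiB granule, 48-bit VA: 9 index bits per level.
def va_indices(va: int) -> dict:
    """Split a virtual address into table indices and page offset."""
    return {
        "l0": (va >> 39) & 0x1FF,
        "l1": (va >> 30) & 0x1FF,
        "l2": (va >> 21) & 0x1FF,
        "l3": (va >> 12) & 0x1FF,
        "offset": va & 0xFFF,
    }


def va_from_indices(l0: int, l1: int, l2: int, l3: int, offset: int = 0) -> int:
    """Rebuild the virtual address that a given chain of entries maps."""
    return (l0 << 39) | (l1 << 30) | (l2 << 21) | (l3 << 12) | offset


if __name__ == "__main__":
    # The second populated level 1 entry in the dump sits at byte offset 0x60,
    # i.e. index 0x60 // 8 == 12, assuming 0x100012000 is the level 1 table.
    print(hex(va_from_indices(0, 0x60 // 8, 0, 0)))
    print(va_indices(0x0000_0003_0000_0000))
```

Running the indices backwards like this is a quick way to confirm which virtual addresses a dumped table actually covers before testing accesses from EL1.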