Arm MMU configuration works on (qemu) raspberry(a53) but not on virt(armv7, a53) board

Now, I’m aware that this is a complex question and might not be resolved here. I am new to embedded/processor programming and I would like to know if there are any major differences between those two boards (CPU-wise). I thought that since it is the same CPU (which I think has the same MMU, at least I couldn’t find any information about different versions), my code for the page table config would be universal.

A bit more context: both versions run exactly the same code, and I am configuring one gigabyte as a single section for the boot loader's memory (the boot loader is the current code, the one writing the page tables).
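For readers following along, here is a minimal sketch of what a single 1 GiB mapping can look like as a descriptor value. It assumes a stage 1 translation with a 4KB granule, where one level 1 block entry covers 1 GiB. The function name and the default attribute values are illustrative assumptions, not the code from this question.

```python
# Sketch of a stage 1, level 1 block descriptor (4KB granule, 1 GiB per entry).
# Bit positions follow the Armv8-A long descriptor format; the default
# attribute values are illustrative assumptions, not the poster's settings.

GIB = 1 << 30


def l1_block_descriptor(phys_addr, attr_index=0, ap=0b00,
                        shareability=0b11, access_flag=True):
    """Return a 64-bit level 1 block descriptor mapping 1 GiB at phys_addr."""
    if phys_addr % GIB != 0:
        raise ValueError("a 1 GiB block must be 1 GiB aligned")
    desc = 0b01                                 # bits [1:0] = 0b01: valid block
    desc |= (attr_index & 0x7) << 2             # AttrIndx: index into MAIR_EL1
    desc |= (ap & 0x3) << 6                     # AP[2:1]: access permissions
    desc |= (shareability & 0x3) << 8           # SH: 0b11 = Inner Shareable
    if access_flag:
        desc |= 1 << 10                         # AF: Access flag
    desc |= phys_addr & 0x0000_FFFF_C000_0000   # output address bits [47:30]
    return desc


# Example: identity-map the 1 GiB region starting at 0x4000_0000.
print(hex(l1_block_descriptor(0x4000_0000)))    # prints 0x40000701
```

The descriptor only describes the mapping. The table that holds it still has to live in memory that can actually be written on the target board.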

When I turn on the MMU (after writing the page table to memory and configuring TCR), the next read of the PC causes a “memory, not accessible” exception, but only on the virt board; on the raspi3b board it works just fine.

Best regards Niklas

0 Martin Weidmann over 2 years ago

Translation tables describe the memory system to the processor. So even if the processor is the same, a differing memory system is going to need different tables. A few things you might want to check.

Perhaps obvious, but is memory actually in the same place on both boards?

Are you in the same Exception level and Security state on both systems? There are differences in the table formats based on EL/Security state. For example, in Secure state the descriptors include an NS (or NSTable) bit, which is ignored in Non-secure state.

Are you relying on any implicit configuration? Most system registers in Armv8-A don't have defined reset values; it's up to software to initialise them. It's easy to write code that "works" but is relying on an uninitialised bit (or a bit set up by previous firmware on one platform but not another).
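One way to act on these checks is to capture raw register values on both boards and compare them bit by bit. The sketch below assumes the values have already been read out somehow (for example with a debugger). The helper names are hypothetical, and only a handful of SCTLR_EL1 bits are listed.

```python
# Sketch: compare register snapshots taken on two boards.
# Only a few well-known SCTLR_EL1 control bits are listed here.
SCTLR_EL1_BITS = {"M": 0, "A": 1, "C": 2, "SA": 3, "I": 12}


def exception_level(current_el):
    """Extract the Exception level from a raw CurrentEL value (bits [3:2])."""
    return (current_el >> 2) & 0b11


def differing_sctlr_bits(value_a, value_b):
    """Return the names of the listed SCTLR_EL1 bits that differ."""
    diff = value_a ^ value_b
    return [name for name, bit in SCTLR_EL1_BITS.items() if (diff >> bit) & 1]


# Hypothetical snapshots: board A has C and I set in addition to M.
print(exception_level(0x4))                    # prints 1
print(differing_sctlr_bits(0x1005, 0x0001))    # prints ['C', 'I']
```

Any bit that differs without your code having written it is a candidate for implicit configuration left behind by earlier firmware.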

0 Niklas43 over 2 years ago in reply to Martin Weidmann

Thank you very much for your reply! It's difficult for me to properly debug the problem because of my lack of knowledge about SoCs.

I've done the following:

- checked the security state and exception level, which are both identical (Non-secure and EL1)

- checked if the configuration registers have had any implicit config that I'm overwriting (which they don't)

- checked the memory layout and experimented with different sizes/memory ranges.

But none of the above indicated any kind of difference. The only obvious difference between the two boards is that the Raspberry boots into EL3 and I'm ERETing to EL1 before the boot loader, but that shouldn't make a difference, right?

thx again!

0 Martin Weidmann over 2 years ago in reply to Niklas43

Niklas43 said:
The only obvious difference between the two boards is that the Raspberry boots into EL3 and I'm ERETing to EL1 before the boot loader, but that shouldn't make a difference, right?

Depends what the firmware is doing before dropping you to EL1. Something should have set up the EL2 registers for example.

When you see the exception, can you get the ESR_EL1/FAR_EL1/ELR_EL1 register values generated by the fault? (Assuming the exception is taken to EL1.)

0 Niklas43 over 2 years ago in reply to Martin Weidmann

Oh, I didn't check the registers because no exception was thrown (I have a GIC handler implemented), but apparently they still contain information (I read them with gdb now).

ESR_EL1 0x86000004 2248146948

(Since I have a "parser" for that register, I know that the exception class is "Instruction Abort taken without a change in Exception level" (ISS (dec): 4), which is an MMU fault and, to the best of my knowledge, indicates an illegal instruction access, so nothing new.)

FAR_EL1 0x260 608

ELR_EL1 0x260 608

FAR and ELR are both the same as in the qemu printed exception (which said: "Cannot access memory at address 0x260").

What do you mean by "setting up the EL2 regs"? Is that a must? I thought that when using EL1/0, the others (EL2/3) don't have to be bothered with.

Btw, I'm invalidating the TLB (with TLBI VMALLE1IS) and the instruction cache (with IC IALLUIS).

0 Martin Weidmann over 2 years ago in reply to Niklas43

Decoding the ESR value (https://developer.arm.com/documentation/ddi0601/2022-06/AArch64-Registers/ESR-EL1--Exception-Syndrome-Register--EL1-?lang=en#fieldset_0-24_0_12):

EC = 0b100001 = Instruction Abort taken without a change in Exception level.
IFSC = 0b000100 = Translation fault, level 0.
EA = 0 = Not an external fault
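The field extraction above can be reproduced with a few shifts and masks. This is a minimal sketch that only handles the Instruction Abort layout used in this thread. The function names are my own, and the fault-name helper covers translation faults only.

```python
# Sketch: split an ESR_ELx value into its top-level fields and, for
# Instruction Aborts, the EA and IFSC sub-fields of the ISS.

def decode_esr(esr):
    """Return a dict with EC, IL, ISS and (for Instruction Aborts) EA, IFSC."""
    ec = (esr >> 26) & 0x3F       # Exception Class, bits [31:26]
    il = (esr >> 25) & 0x1        # Instruction Length, bit [25]
    iss = esr & 0x1FFFFFF         # Instruction Specific Syndrome, bits [24:0]
    fields = {"EC": ec, "IL": il, "ISS": iss}
    if ec in (0b100000, 0b100001):            # Instruction Abort classes
        fields["EA"] = (iss >> 9) & 0x1       # External abort type, bit [9]
        fields["IFSC"] = iss & 0x3F           # Fault status code, bits [5:0]
    return fields


def translation_fault_level(ifsc):
    """Return the lookup level for a translation fault code, else None."""
    if 0b000100 <= ifsc <= 0b000111:          # 0b0001LL encodes level LL
        return ifsc & 0b11
    return None


fields = decode_esr(0x86000004)
print(bin(fields["EC"]))                        # prints 0b100001
print(translation_fault_level(fields["IFSC"]))  # prints 0
```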

So it looks like the MMU table walk is failing early in the process. Now this could be because you have an L0 table entry marked as "Fault", or because something about the table is invalid, as these are also reported as L0 faults.

It's a while since I had to write MMU set up code, but here is the list of things the Arm ARM lists as causing L0 translation faults:

R_VZZSZ When one or more of the following apply, a level 0 Translation fault is generated on the relevant translation stage:
• The IA does not map onto a TTBR_ELx address range.
• If the IA maps onto the TTBR0_ELx address range and the IA contains any one bits above the configured IA size as determined by TCR_ELx.T0SZ.
• If the IA maps onto the TTBR1_ELx address range and the IA contains any zero bits above the configured IA size as determined by TCR_ELx.T1SZ.
• When a TLB miss occurs, the corresponding TCR_ELx.EPDn field prevents a translation table walk using TTBRn_ELx.
• When FEAT_E0PD is implemented, the corresponding TCR_ELx.E0PDn field prevents unprivileged access to an address translated by TTBRn_ELx.
• When FEAT_SVE is implemented, the corresponding TCR_ELx.NFDy field prevents non-faulting unprivileged accesses to an address translated by TTBRy_ELx.
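The first three bullets come down to a range check on the input address. Here is a rough sketch of that check, assuming a 64-bit input address and ignoring address tagging (TBI). It does not model the EPDn, E0PDn or NFDy conditions, and the function name is my own.

```python
# Sketch: decide which TTBR range a 64-bit input address (IA) falls into
# for given TCR_EL1.T0SZ / T1SZ values. Address tagging (TBI) is ignored.

def ttbr_range(ia, t0sz, t1sz):
    """Return "TTBR0", "TTBR1", or None if the IA maps to neither range."""
    ia &= (1 << 64) - 1
    ttbr0_limit = 1 << (64 - t0sz)                 # TTBR0: [0, 2^(64-T0SZ))
    ttbr1_base = (1 << 64) - (1 << (64 - t1sz))    # TTBR1: top 2^(64-T1SZ) bytes
    if ia < ttbr0_limit:
        return "TTBR0"
    if ia >= ttbr1_base:
        return "TTBR1"
    return None    # in the gap between the ranges: level 0 Translation fault


# With T0SZ = T1SZ = 25 (39-bit ranges), the faulting address from this
# thread falls in the TTBR0 range.
print(ttbr_range(0x260, 25, 25))                   # prints TTBR0
print(ttbr_range(0x0000_0080_0000_0000, 25, 25))   # prints None
```

An address as low as 0x260 lands in the TTBR0 range for any valid T0SZ, so in a case like this the out-of-range bullets are unlikely to be the cause, and the table contents or the EPD0 setting are better suspects.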

0 Niklas43 over 2 years ago in reply to Martin Weidmann

ok, found my issue. And actually it was the first thing you said!

The virt board, unlike the Raspberry, has ROM (up to 1 GiB), so I couldn't write the page tables... Kind of obvious, but thank you very much for your thorough help; it really helped me understand the MMU config process better!
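A cheap guard against this class of bug is to check the table location against the board's RAM window before writing anything. The sketch below assumes that QEMU's virt machine places RAM at 0x4000_0000 and that raspi3b RAM starts at address 0, each 1 GiB in size here. These values, the table address and the function name are assumptions for illustration and should be checked against the actual machine configuration.

```python
# Sketch: check that a translation table lies entirely in writable RAM.
# RAM base/size values are assumptions; verify them for your machine setup.

GIB = 1 << 30

ASSUMED_RAM = {
    "raspi3b": (0x0000_0000, GIB),   # assumed: RAM starts at address 0
    "virt":    (0x4000_0000, GIB),   # assumed: RAM starts at 1 GiB
}


def table_in_ram(table_addr, table_size, ram_base, ram_size):
    """True if [table_addr, table_addr + table_size) lies inside RAM."""
    return ram_base <= table_addr and table_addr + table_size <= ram_base + ram_size


TABLE_ADDR = 0x0010_0000   # hypothetical table location used on both boards
TABLE_SIZE = 0x1000        # one 4KB table

for board, (base, size) in ASSUMED_RAM.items():
    print(board, table_in_ram(TABLE_ADDR, TABLE_SIZE, base, size))
```

Under these assumptions the same table address is fine on raspi3b but falls outside RAM on virt, where the writes are lost and the walk then reads an invalid entry.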