Hi ARM experts,
I have a problem using the Armv8 MMU in a bare-metal system:
When using the 4KB translation granule, a level 1 block descriptor (D_Block) maps a VA to a 1GB PA region.
According to the Armv8 ARM (page D4-1744), the table lookup starts at level 0.
Is the level 0 table an essential step in mapping the PA?
Can I bypass the level 0 table and perform the MMU translation via the level 1 translation table alone? In other words, can I write the L1 table address directly into the TTBR0 register?
Thanks!!
Hello,
Yes, this is possible, but you will need to correctly configure the translation regime in order for it to work.
A translation table is defined as occupying exactly one granule of memory, so at 4KB granularity each table occupies 4KB. Each entry in a table is a 64-bit descriptor, i.e. 8 bytes, so there are 512 entries per table in this case. As you mentioned, each entry in a 4KB granularity L1 table maps 1GB of memory. Therefore, a complete L1 table will map 512GB (512 entries * 1GB per entry).
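The arithmetic above can be sketched as a couple of helper functions (host-compilable C; the function names are mine, not from any Arm header):

```c
#include <stdint.h>

// Descriptors are 64 bits (8 bytes), and a table occupies one granule,
// so a 4KB granule gives 4096 / 8 = 512 entries per table.
uint64_t entries_per_table(uint64_t granule_bytes) {
    return granule_bytes / 8;
}

// Each 4KB-granule L1 entry maps a 1GB block, so a full L1 table
// covers 512 entries * 1GB = 512GB.
uint64_t l1_table_coverage(uint64_t granule_bytes) {
    return entries_per_table(granule_bytes) * (1ULL << 30);
}
```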
So providing the size of your virtual address space is no greater than 512GB, you only need a single L1 table. If, however, your virtual address space is greater than 512GB, you will need a second L1 table, and at that point you will also need an L0 table in order to select between the two L1 tables.
The size of the virtual address space is controlled by the TCR_ELx.T0SZ field, with the size being defined (2 ^^ (64 - T0SZ)). A 512GB virtual address space is 39 bits, so the T0SZ field would need to be 25 or greater. Note that this concept can be taken further - If you reduce the size of the virtual address space to between 25 and 30 bits inclusive, at 4KB granularity the translation regime will actually start at level 2, so TTBR0_ELx will point to the base physical address of an L2 table in that case.
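As a rough sketch of the relationship just described (for the 4KB granule only; the helper name `start_level_4k` is my own), the initial lookup level follows from the virtual address size, which is (64 - T0SZ) bits:

```c
// Initial lookup level for the 4KB granule, as a function of TCR_ELx.T0SZ.
// Each lookup level resolves 9 VA bits: L0 covers VA[47:39], L1 covers
// VA[38:30], L2 covers VA[29:21].
int start_level_4k(unsigned t0sz) {
    unsigned va_bits = 64 - t0sz;
    if (va_bits >= 40) return 0;  // VA[47:39] in use: L0 table needed
    if (va_bits >= 31) return 1;  // 31..39 bits: TTBR0 points at an L1 table
    return 2;                     // 25..30 bits: TTBR0 points at an L2 table
}
```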
I hope that helps,
Ash.
Hi Ash,
So, if the virtual address space is 512GB and I use a 4KB-granularity L1 table, are 4K bytes enough for the MMU table in this case? I mean that only one 4KB L1 table is needed to map the whole 512GB virtual space, and I push this L1 table address into TTBR0_ELx, is that right?
One more question: does the value of T0SZ depend on the PARange field of the ID_AA64MMFR0_EL1 register in the flat-mapping case? For example, if PARange indicates the physical address is 40 bits (1TB), must the T0SZ field be 24 in the flat-mapping case? In such a flat-mapping case, if software can confirm that it only accesses the low 1GB of virtual space, could I just use a 4KB-granularity L2 table to map the low 1GB of virtual space?
tristan wrote: So, if the virtual address space is 512GB and I use a 4KB-granularity L1 table, are 4K bytes enough for the MMU table in this case? I mean that only one 4KB L1 table is needed to map the whole 512GB virtual space, and I push this L1 table address into TTBR0_ELx, is that right?
Correct, providing you have correctly configured the TCR_ELx.T0SZ field to limit the size of the virtual address space to 512GB.
tristan wrote: One more question: does the value of T0SZ depend on the PARange field of the ID_AA64MMFR0_EL1 register in the flat-mapping case? For example, if PARange indicates the physical address is 40 bits (1TB), must the T0SZ field be 24 in the flat-mapping case?
That depends on whether there is anything actually accessible in the top 512GB of the 1TB physical address space. Take for example the Juno ARM Development Platform board's memory map, which is 40 bits (1TB), but the top 512GB are Reserved. In this case, for flat mapping we can actually just limit the size of virtual address space to 39 bits (512GB), and then any attempt to translate a virtual address outside of this range will result in a fault.
However, a key thing to note here is that you would see an Address Size Fault as opposed to a Translation Fault. If you would want virtual addresses in the top 512GB to cause Translation Faults then you would need to make the virtual address space also be 40 bits, and then have an L0 table with the 1st entry being a table descriptor pointing to your L1 table, and the 2nd entry being a fault descriptor.
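The L0 layout just described (a 40-bit VA space whose first entry points at the L1 table and whose second entry faults) might be sketched like this. The descriptor bit positions follow the VMSAv8-64 format; the L1 physical address passed in is purely illustrative:

```c
#include <stdint.h>
#include <string.h>

#define DESC_VALID  (1ULL << 0)
#define DESC_TABLE  (1ULL << 1)   // at L0..L2, bits [1:0] = 0b11 => table descriptor

static uint64_t l0_table[512];

// Entry 0: table descriptor pointing at the L1 table (covers VA 0..512GB).
// Entry 1 left as 0 (a fault descriptor), so accesses to the top 512GB of
// the 1TB VA space take a Translation Fault rather than an Address Size Fault.
void build_l0(uint64_t l1_pa) {
    memset(l0_table, 0, sizeof l0_table);
    l0_table[0] = (l1_pa & 0x0000FFFFFFFFF000ULL)  // bits [47:12]: next-level table address
                | DESC_TABLE | DESC_VALID;
}
```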
tristan wrote: In such a flat-mapping case, if software can confirm that it only accesses the low 1GB of virtual space, could I just use a 4KB-granularity L2 table to map the low 1GB of virtual space?
Correct, you could limit the size of the virtual address space to be between 25 and 30 bits inclusive, and that would allow TTBR0_ELx to point to an L2 table. Although keep in mind that what I said above also applies here; virtual addresses in the top 1023GB of the 1TB physical address space will cause Address Size Faults as opposed to Translation Faults. If this matters to you then you'll need to match the sizes of the virtual and physical address spaces and then use fault descriptors where necessary.
So if the virtual address space is smaller than 512GB, I can directly use a level 1 table.
I think maybe the descriptor value or the TCR_EL1 value that I use is not correct.
Thanks for your reply!
tristan wrote: So if the virtual address space is smaller than 512GB, I can directly use a level 1 table.
Yes, providing you have correctly set the value of TCR_ELx.T0SZ to limit the size of the virtual address space to be between 31 and 39 bits inclusive - this corresponds to a T0SZ value of between 25 and 33 inclusive, as the size of the virtual address space is defined as being equal to (2 ^^ (64 - T0SZ)). Also, keep in mind that translations at EL1/EL0 will use either TTBR0_EL1 if the top 16 bits [63:48] of the virtual address are all 0, or TTBR1_EL1 if the top 16 bits [63:48] of the virtual address are all 1. When reducing the size of the TTBR0_EL1 virtual address space, the base stays at 0x0 and the limit moves, whereas when reducing the size of the TTBR1_EL1 virtual address space, the base moves and the limit stays at 0xFFFFFFFF,FFFFFFFF. With this in mind, you need to be careful with the virtual addresses that you are using, and take note of which TTBR register is being used to translate them.
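The TTBR selection rule above can be sketched as a small helper (my own naming; this is the simple case with 48-bit VA spaces, before T0SZ/T1SZ shrink either region):

```c
#include <stdint.h>

// Returns 0 for the TTBR0_EL1 region, 1 for the TTBR1_EL1 region,
// or -1 if the VA falls in neither (which would fault on translation).
int which_ttbr(uint64_t va) {
    uint64_t top = va >> 48;          // bits [63:48]
    if (top == 0x0000) return 0;      // all zeros: TTBR0_EL1, base fixed at 0x0
    if (top == 0xFFFF) return 1;      // all ones: TTBR1_EL1, limit fixed at ~0
    return -1;
}
```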
Are you taking a Synchronous Data Abort when attempting to access a virtual address with the MMU turned on? Please can you show me the value of the ESR_EL1 (Exception Syndrome Register EL1) at the point that the exception is taken? There are a number of things that could be going wrong, the DFSC field (bits [5:0] of ESR_EL1 for a Synchronous Data Abort) will help to narrow down the issue.
Ash
Thanks a lot, Ash!
If the virtual address space is smaller than 512GB, are there any limitations on using a level 0 table?
For example: the virtual address space is 512GB, the first 8MB of the space is "Normal Memory" and the rest is "Device", using 4KB-granularity tables. I tried an MMU table like the following:
1 The first entry (512GB of space) of the level 0 translation table is a table entry pointing to the level 1 table; the other entries of level 0 are invalid (fault);
2 The first entry (1GB of space) of the level 1 table is a table entry pointing to the level 2 table; the other entries are all block entries with the "Device" attribute;
3 The first 4 entries (4 * 2MB of space) of the level 2 table are block entries with the "Normal Memory" attribute, while the other entries are "Device".
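A rough sketch of the L1/L2 layout described above (host-compilable C; it assumes MAIR_EL1 index 0 = Normal and index 1 = Device, and all bit positions follow the VMSAv8-64 descriptor format):

```c
#include <stdint.h>

#define VALID    (1ULL << 0)
#define TABLE    (1ULL << 1)            // table descriptor at L0..L2
#define BLOCK    (0ULL << 1)            // block descriptor at L1/L2
#define AF       (1ULL << 10)           // Access Flag: if clear, accesses fault
#define ATTR(i)  ((uint64_t)(i) << 2)   // AttrIndx into MAIR_EL1

static uint64_t l1[512], l2[512];

void build_tables(void) {
    // L1 entry 0 points at the L2 table; other entries are 1GB Device blocks.
    l1[0] = ((uint64_t)(uintptr_t)l2 & 0x0000FFFFFFFFF000ULL) | TABLE | VALID;
    for (int i = 1; i < 512; i++)
        l1[i] = ((uint64_t)i << 30) | ATTR(1) | AF | BLOCK | VALID;
    // L2: first 4 entries (8MB) are Normal blocks, the rest 2MB Device blocks.
    for (int i = 0; i < 512; i++) {
        uint64_t attr = (i < 4) ? ATTR(0) : ATTR(1);
        l2[i] = ((uint64_t)i << 21) | attr | AF | BLOCK | VALID;
    }
}
```

Note the Access Flag: a descriptor with AF clear causes an Access Flag fault on first use (unless hardware AF management is enabled), which is a common cause of bare-metal hangs right after enabling the MMU.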
The system hangs after enabling the MMU if I write the base address of the level 0 table to TTBR0, while the system works well if I write the level 1 table base address. I am still tracking this down, but I have not found the root cause yet.
Does any limitation exist? Thanks!
Hi,
If you've set TCR_EL1.T0SZ in such a way that your virtual address space is configured to be 512GB, then what you are describing is most likely the expected behaviour. As outlined in my earlier reply, doing this at 4KB granularity will cause translations to start at L1, in other words, the translation table pointed to by TTBR0_EL1 will be interpreted as an L1 table, rather than an L0 table. This means that the first entry in your L0 table is actually mapping only 1GB, not 512GB like you think it is, because the L0 table is being interpreted as an L1 table. If all other entries in the table are fault descriptors then you'll be getting a First Level Translation Fault.
We can confirm this by looking at the value of the ESR_EL1 register. Please can you single step the write to SCTLR_EL1 that is enabling the MMU, and then provide the value of ESR_EL1?
Ash,
Thanks for your reply, it is very helpful!!
I think I misunderstood the translation table walk before. I missed the effect of T0SZ on the walk. As I understand it now, T0SZ determines the size of the virtual address space, and the virtual address space in turn determines the level at which the initial lookup starts. In the case where T0SZ >= 25 (virtual address space <= 512GB), as you said above, "doing this at 4KB granularity will cause translations to start at L1".
Correct.
And to go even further, at 4KB granularity, setting T0SZ to be >= 34 (so that the virtual address space is <= 1 GB) will cause translations to start at L2, because a single L2 table at 4KB granularity maps 1GB (512 entries * 2MB per entry).
Got it, many thanks! Ash.
The value of ESR_EL1 is 0x96000046. It's a Data Abort exception.
So I think maybe a descriptor value in one of the level tables is not correct.
An ESR_EL1 value of 0x96000046 corresponds to a 2nd level translation fault that occurred on a write instruction.
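For reference, here is where that reading comes from: a quick decode of the relevant ESR_EL1 fields (helper names are mine):

```c
#include <stdint.h>

// ESR_ELx field extraction for a Synchronous Data Abort.
unsigned esr_ec(uint64_t esr)   { return (esr >> 26) & 0x3F; } // Exception Class
unsigned esr_wnr(uint64_t esr)  { return (esr >> 6) & 1; }     // 1 = write, 0 = read
unsigned esr_dfsc(uint64_t esr) { return esr & 0x3F; }         // Data Fault Status Code
```

For 0x96000046 this gives EC = 0x25 (Data Abort taken without a change in Exception level), WnR = 1 (write), and DFSC = 0x06 (Translation fault, level 2).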
If you're using DS-5, I suggest you use the MMU view to ensure your translation tables have been correctly configured. Once the translation tables have been programmed and TCR_EL1 has been configured, in DS-5's debug view navigate to: Window -> Show View -> MMU.
When you take the data abort, you can check the FAR_EL1 (Faulting Address Register) to get the virtual address that couldn't be translated, and use the MMU view to narrow down the issue. In particular, the Memory Map tab of the MMU view will help to quickly spot any issues.
Ash Wilding, why is a Block descriptor not permitted at level 0 with the 4KB granule?