Enable and disable MMU page table caching in L2

XNoOp over 5 years ago

Hello,

I am using a dual core Cortex A9 CPU and I want to enable MMU caching in L2.

By default all the DDR memory region is set as non-cacheable.

But then I want only the DDR regions allocated to the page table to be cacheable. For this purpose I do the following:

1) Use the __mmu_tbl_start and __mmu_tbl_end symbols from the linker script in my test C code:

extern u32 __mmu_tbl_start;
extern u32 __mmu_tbl_end;

2) Then I set the TTBR0 register so that L2 cache is enabled for MMU page table.

TTBReg = mfcp(XREG_CP15_TTBR0);
mtcp(XREG_CP15_TTBR0, (TTBReg | L2CachingMask));
dsb();
isb();
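The value of L2CachingMask is not shown above. As a minimal sketch, assuming the ARMv7-A short-descriptor TTBR0 layout with the Multiprocessing Extensions (RGN in bits [4:3], IRGN split across bits 6 and 0, S in bit 1), a TTBR0 value could be composed as follows. The helper name ttbr0_value and the enum are hypothetical and only illustrate the bit layout; they are not part of the Xilinx BSP.

```c
#include <assert.h>
#include <stdint.h>

/* Cacheability encodings shared by the RGN and IRGN fields (ARMv7-A). */
enum walk_cache {
    WALK_NON_CACHEABLE = 0u, /* 0b00 */
    WALK_WB_WA         = 1u, /* 0b01: write-back, write-allocate */
    WALK_WT            = 2u, /* 0b10: write-through */
    WALK_WB_NO_WA      = 3u  /* 0b11: write-back, no write-allocate */
};

/* Hypothetical helper: build a TTBR0 value for the short-descriptor format.
 * Assumes TTBCR.N == 0, so the table base must be 16 KB aligned. */
static uint32_t ttbr0_value(uint32_t table_base, enum walk_cache outer,
                            enum walk_cache inner, int shareable)
{
    assert((table_base & 0x3FFFu) == 0u);

    uint32_t v = table_base;
    v |= ((uint32_t)outer & 3u) << 3;         /* RGN, bits [4:3]: outer (L2) */
    v |= (((uint32_t)inner >> 1) & 1u) << 0;  /* IRGN[1] lives in bit 0 */
    v |= ((uint32_t)inner & 1u) << 6;         /* IRGN[0] lives in bit 6 */
    v |= shareable ? (1u << 1) : 0u;          /* S, bit 1 */
    return v;
}
```

Under these assumptions, caching table walks in L2 only (outer write-back write-allocate, inner non-cacheable) corresponds to OR-ing 0x08 into the table base, so a mask of that form would be one plausible choice for L2CachingMask.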

3) Afterwards I set the memory attributes of the DDR region allocated to the MMU page table (only 16 KB that is contained in a single 1MB section) as follows:

u32 mmu_tbl_start_addr = (u32)&__mmu_tbl_start;
u32 mmu_tbl_end_addr = (u32)&__mmu_tbl_end;
for (u32 pte_address = mmu_tbl_start_addr; pte_address <= mmu_tbl_end_addr; pte_address += SECTION_SIZE) {
	Xil_SetTlbAttributes(pte_address, 0x25de2);
}
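To make the attribute value 0x25de2 easier to check, here is a small decoder sketch, assuming the ARMv7-A short-descriptor section format (TEX in bits [14:12], C in bit 3, B in bit 2, and so on). The struct and function names are hypothetical. With TEX[2] set, TEX[1:0] gives the outer policy and C:B the inner policy, using the encoding 0 = non-cacheable, 1 = write-back write-allocate, 2 = write-through, 3 = write-back no write-allocate.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a first-level section descriptor (base address excluded). */
struct section_attrs {
    unsigned type;   /* bits [1:0], 0b10 for a section */
    unsigned b;      /* bit 2 */
    unsigned c;      /* bit 3 */
    unsigned xn;     /* bit 4 */
    unsigned domain; /* bits [8:5] */
    unsigned ap;     /* bits [11:10] */
    unsigned tex;    /* bits [14:12] */
    unsigned ap2;    /* bit 15 */
    unsigned s;      /* bit 16 */
    unsigned ng;     /* bit 17 */
};

static struct section_attrs decode_section(uint32_t d)
{
    struct section_attrs a;
    a.type   = d & 3u;
    a.b      = (d >> 2) & 1u;
    a.c      = (d >> 3) & 1u;
    a.xn     = (d >> 4) & 1u;
    a.domain = (d >> 5) & 0xFu;
    a.ap     = (d >> 10) & 3u;
    a.tex    = (d >> 12) & 7u;
    a.ap2    = (d >> 15) & 1u;
    a.s      = (d >> 16) & 1u;
    a.ng     = (d >> 17) & 1u;
    return a;
}

/* Outer (L2) policy; returns -1 when TEX[2] is clear (other encodings). */
static int outer_policy(const struct section_attrs *a)
{
    assert(a != 0);
    return (a->tex & 4u) ? (int)(a->tex & 3u) : -1;
}

/* Inner (L1) policy; returns -1 when TEX[2] is clear (other encodings). */
static int inner_policy(const struct section_attrs *a)
{
    assert(a != 0);
    return (a->tex & 4u) ? (int)((a->c << 1) | a->b) : -1;
}
```

Decoded this way, 0x25de2 describes a section with TEX = 0b101, C = 0, B = 0, that is outer write-back write-allocate and inner non-cacheable, which matches the goal of caching the table in L2 only.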

The function Xil_SetTlbAttributes is unmodified from the one provided by Xilinx, and is given below:

void Xil_SetTlbAttributes(INTPTR Addr, u32 attrib)
{
	u32 *ptr;
	u32 section;

	section = Addr / 0x100000U;
	ptr = &MMUTable;
	ptr += section;
	if(ptr != NULL) {
		*ptr = (Addr & 0xFFF00000U) | attrib;
	}

	Xil_DCacheFlush();
	mtcp(XREG_CP15_INVAL_UTLB_UNLOCKED, 0U);
	mtcp(XREG_CP15_INVAL_BRANCH_ARRAY, 0U);

	dsb(); /* ensure completion of the BP and TLB invalidation */
	isb(); /* synchronize context on this processor */
}

4) I try to corrupt the MMU page table as follows:

a) load one PTE in L2 cache (reference to the associated)

b) Disable the L2 cache parity check, and modify the memory attributes of the Page Table Entry.

c) Finally, re-enable the L2 cache parity check and read the data associated with this PTE.

However, as soon as I disable the parity check, the CPU raises an Undefined exception. The exact trigger point is the write to the L2 cache control register that generates a synchronization operation.

An Undefined exception is raised when a fetched instruction cannot be decoded by the CPU.

Q1: I am not sure how to debug this any further. The cause does not seem as explicit as for Data and Prefetch Aborts.

Q2: Do I need to disable and re-enable the MMU for the new cacheability behavior to take effect?

Thank you.

Florian


Parents

0 XNoOp over 5 years ago in reply to 42Bastian Schick

Hello Bastian,

I am not very clear about that, because to me they seem to be intertwined.

Strictly speaking:

- the RGN bits of TTBR0 determine whether hardware table walks are cacheable in the L2 (outer) cache.

- the IRGN bits of TTBR0 determine whether hardware table walks are cacheable in the L1 (inner) cache.

For a Page Table Entry (PTE) with 1MB section granularity, the table walk only needs to fetch the level 1 PTE directly from the MMU page table. So on its way to the TLB, the PTE is stored in the L2 cache and/or the L1 cache.

If the memory region holding the MMU page table is not cacheable, a store to the PTE is issued directly to the backing memory and bypasses any PTE copies already held in the L2 cache and/or the L1 cache.

But if that memory region is cacheable, the store hits in the cache system.

Is my understanding correct?

Thanks.

Florian
Children

+1 XNoOp over 5 years ago in reply to XNoOp

Hello All,

I am glad to say that I solved the issue with the Undefined exception.

I wrote a handler for the Undefined exception, which let me recover the PC at which the code was executing in User mode before the CPU switched to Undefined mode. The handler takes this value from the state saved on exception entry and stores it in a global variable, so that it remains accessible later from the handler executing in Undefined mode.
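For reference, the address arithmetic behind such a handler can be sketched as below. On ARMv7-A, the banked link register of Undefined mode holds the address of the instruction after the undefined one, and the T bit (bit 5) of the banked SPSR tells whether the interrupted code was in ARM or Thumb state. The function name undef_fault_address is hypothetical, and reading the two banked registers is assumed to be done separately in the handler's assembly entry stub.

```c
#include <assert.h>
#include <stdint.h>

#define PSR_T_BIT (1u << 5) /* Thumb state bit in CPSR/SPSR */

/* Hypothetical helper: address of the instruction that raised the
 * Undefined exception, given the banked LR and SPSR of Undefined mode.
 * The preferred return address is the next instruction, so subtract
 * 4 bytes in ARM state or 2 bytes in Thumb state. */
static uint32_t undef_fault_address(uint32_t lr_und, uint32_t spsr_und)
{
    uint32_t offset = (spsr_und & PSR_T_BIT) ? 2u : 4u;

    assert(lr_und >= offset);
    return lr_und - offset;
}
```

The returned address can then be compared against the linker map to find which section the faulting fetch came from.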

As a result, I could pinpoint that the Undefined exception occurred in the same 1MB section as the one where the MMU translation table is stored. So I guess this piece of code had become cacheable as well. But the CPU fetched an invalid instruction (.word 0xFFFFFFFF in the assembly trace).

My guess is that it fetched the instruction from the L2 cache, even though it had never been allocated there.

My solution was simple: guarantee that the MMU translation table does not share its 1MB section with anything else.

For this purpose, I forced the alignment of the .mmu_tbl section to 1MB and forced the total size of the section to be exactly 1MB.
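A check along these lines could catch the sharing problem early. This is only a sketch: it assumes the caller passes the start and end addresses of the whole output section (for example, taken from linker-defined symbols), and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SECTION_SIZE 0x100000u /* 1MB first-level section */

/* Hypothetical check: returns 1 when [start, end) begins on a 1MB
 * boundary and spans a whole number of 1MB sections, so that nothing
 * else can inherit the attributes set for the translation table. */
static int owns_whole_sections(uint32_t start, uint32_t end)
{
    assert(SECTION_SIZE != 0u);
    return start < end
        && (start % SECTION_SIZE) == 0u
        && ((end - start) % SECTION_SIZE) == 0u;
}
```

For example, a region from 0x300000 to 0x304000 fails the check because the remaining part of the 1MB section is free for other code or data, while a region from 0x300000 to 0x400000 passes.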

Result: I could trigger a Data Abort with TTB Walk L1 Synchronous External Abort condition, and I could check from the L2 cache controller that it was due to a Data RAM parity error at address 0x300444. Since the Translation Table is located between 0x300000 and 0x304000, I could therefore conclude that it was because of the PTE corruption...
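As a worked example of that last step, the reported address can be mapped back to a first-level entry. The sketch below assumes 4-byte section descriptors, a table base of 0x300000, and that the address reported by the L2 cache controller is the byte address of the affected entry; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_SIZE       4u      /* bytes per first-level descriptor */
#define L1_TABLE_BYTES 0x4000u /* 4096 entries * 4 bytes = 16 KB */

/* Hypothetical helper: index of the first-level entry at entry_addr. */
static uint32_t pte_index(uint32_t table_base, uint32_t entry_addr)
{
    assert(entry_addr >= table_base);
    assert(entry_addr - table_base < L1_TABLE_BYTES);
    return (entry_addr - table_base) / PTE_SIZE;
}

/* Base virtual address of the 1MB section translated by that entry. */
static uint32_t pte_va_base(uint32_t index)
{
    assert(index < L1_TABLE_BYTES / PTE_SIZE);
    return index << 20;
}
```

Under these assumptions, address 0x300444 is entry 0x111 of the table, which translates the virtual range starting at 0x11100000.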

Florian