MMU and Cache configuration

Hello there,

I want to enable the MMU and caches to improve the performance of my ARM Cortex-A5 core.

I have gone through the ARM Cortex-A5 reference manual and found the steps below for enabling the MMU and caches.

Steps:

1. Disable the caches and branch predictor.

2. Invalidate the caches and TLB.

3. Set up the translation table entries and point the TTBR register to the translation table.

4. Enable the caches and branch predictor.

5. Enable the MMU.
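As a rough illustration of steps 1, 4 and 5, here is a minimal sketch of the SCTLR bit manipulation involved. The bit positions (M, C, Z, I) are the ARMv7-A ones; the function names are made up for this example. The real register access uses `MRC`/`MCR p15, 0, <Rt>, c1, c0, 0` and is left out so the arithmetic can be checked on any host. The invalidation in step 2 and the TTBR/DACR setup in step 3 are not shown.

```c
#include <stdint.h>

/* ARMv7-A SCTLR bit positions (CP15 c1, c0, 0). */
#define SCTLR_M (1u << 0)   /* MMU enable */
#define SCTLR_C (1u << 2)   /* data/unified cache enable */
#define SCTLR_Z (1u << 11)  /* branch prediction enable */
#define SCTLR_I (1u << 12)  /* instruction cache enable */

/* Step 1: value to write back with MMU, caches and branch predictor off. */
uint32_t sctlr_all_off(uint32_t sctlr)
{
    return sctlr & ~(SCTLR_M | SCTLR_C | SCTLR_Z | SCTLR_I);
}

/* Step 4: caches and branch predictor on, MMU bit left unchanged. */
uint32_t sctlr_caches_on(uint32_t sctlr)
{
    return sctlr | SCTLR_C | SCTLR_Z | SCTLR_I;
}

/* Step 5: MMU on, everything else left unchanged. */
uint32_t sctlr_mmu_on(uint32_t sctlr)
{
    return sctlr | SCTLR_M;
}
```

On the target, each helper would sit between a read of SCTLR and a write back, followed by the usual barriers (DSB, ISB).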

Queries

1. I need help understanding how the translation table entries are defined. If I choose the 1MB section option, how can I make an entry for external flash memory, which is 16MB?

Here we have different memory regions: internal RAM (1MB), internal flash (4MB), external flash (16MB) and so on.

How can I define translation table entries for the memory regions mentioned above?

2. Translation tables are stored in main memory. Is main memory here the RAM?

3. How can we check in TRACE32 whether all the above steps are working?

By executing the command cache.view I am able to see the cache contents, but I am unable to understand them.

Thanks in Advance.

ZbinAhmed

Parents
  • First of all, read the ARM Cortex-A Series Programmer's Guide for ARMv7-A, chapter "The Memory Management Unit".

    1) Is the external flash memory addressed directly or indirectly (via a flash controller)? The first-level translation table consists of 4096 32-bit entries in memory (16KB in total), one 32-bit entry per 1MB section. For 16MB of memory you have to configure 16 entries in the first-level translation table. In the chapter "The Memory Management Unit" you can find examples of translation and entry configuration.

    For example, here are some code fragments (for the Cortex-A5 based Atmel SAMA5D2):

    ...

    ALIGNED(16384) static uint32_t tlb[4096]; // translation table allocated in main memory

    ...

    uint32_t addr;

    /* Reset table entries */
    for (addr = 0; addr < 4096; addr++)
        tlb[addr] = 0;

    ...

    /* 0x00000000: ROM */
    tlb[0x000] = TTB_SECT_ADDR(0x00000000)
               | TTB_SECT_AP_READ_ONLY
               | TTB_SECT_DOMAIN(0xf)
               | TTB_SECT_EXEC
               | TTB_SECT_CACHEABLE_WB
               | TTB_TYPE_SECT;

    /* 0x00100000: NFC SRAM */
    tlb[0x001] = TTB_SECT_ADDR(0x00100000)
               | TTB_SECT_AP_FULL_ACCESS
               | TTB_SECT_DOMAIN(0xf)
               | TTB_SECT_EXEC
               | TTB_SECT_SHAREABLE_DEVICE
               | TTB_TYPE_SECT;

    /* 0x00200000: SRAM */
    tlb[0x002] = TTB_SECT_ADDR(0x00200000)
               | TTB_SECT_AP_FULL_ACCESS
               | TTB_SECT_DOMAIN(0xf)
               | TTB_SECT_EXEC
               | TTB_SECT_CACHEABLE_WB
               | TTB_TYPE_SECT;

    ...

    2) Yes, RAM is main memory.
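For the 16MB case in point 1, the 16 entries can be filled in a loop. The following is only a sketch under assumptions: the flash base address 0x10000000 in the usage note is hypothetical, the macro and function names are invented for this example (they are not the Atmel `TTB_*` macros above), and the mapping is flat (VA = PA). The field positions follow the ARMv7-A short-descriptor section format.

```c
#include <stddef.h>
#include <stdint.h>

#define L1_ENTRIES   4096u
#define SECTION_SIZE 0x00100000u          /* 1 MB */

/* ARMv7-A short-descriptor section fields (illustrative names). */
#define SECT_TYPE      (2u << 0)           /* bits[1:0] = 0b10 */
#define SECT_B         (1u << 2)           /* bufferable */
#define SECT_C         (1u << 3)           /* cacheable */
#define SECT_XN        (1u << 4)           /* execute never */
#define SECT_DOMAIN(d) (((d) & 0xFu) << 5) /* DACR must permit this domain */
#define SECT_AP_RW     (3u << 10)          /* AP[1:0] = 0b11, full access */

/*
 * Flat-map (VA == PA) the region [base, base + size) using 1 MB sections.
 * Returns the number of entries written, or 0 if base or size is not a
 * multiple of 1 MB or the region does not fit in the table.
 */
size_t map_flat_sections(uint32_t *l1, uint32_t base, uint32_t size,
                         uint32_t attrs)
{
    if (size == 0 || (base % SECTION_SIZE) != 0 || (size % SECTION_SIZE) != 0)
        return 0;

    uint32_t first = base >> 20;           /* table index = VA[31:20] */
    uint32_t count = size >> 20;
    if (first + count > L1_ENTRIES)
        return 0;

    for (uint32_t i = 0; i < count; i++) {
        uint32_t pa = (first + i) << 20;   /* section base, PA[31:20] */
        l1[first + i] = pa | attrs | SECT_TYPE;
    }
    return count;
}
```

With the hypothetical base, `map_flat_sections(tlb, 0x10000000, 16 * SECTION_SIZE, SECT_AP_RW | SECT_DOMAIN(0) | SECT_C | SECT_B)` would write entries 0x100 to 0x10F. C = 1 and B = 1 with TEX = 0 select write-back cacheable Normal memory when TEX remap is off.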

Children
  • Thanks Van,

    I will go through it.

    Suppose we have two 512KB SRAM regions (SRAM0, SRAM1). Do we need to make separate entries for them?

    Are the addresses you put in the section field the physical addresses of the memory modules?

    Could you please take some time to explain, with a diagram, the VA to PA mapping for the entries you mentioned?

    How does the VA to PA translation take place?

    "For example, code fragments (for Cortex A5 Atmel Sama5d2)": can you give me a link to the source code, and to any documentation there is?

    Thank you so much.

  • Suppose we have two 512KB SRAM regions (SRAM0, SRAM1). Do we need to make separate entries for them? - It depends on your choice. You can configure this memory region using a second-level translation table with 4KB or 64KB pages.

    Are the addresses in the section field the physical addresses of the memory modules? - In the above example the translation table entries are flat mappings, VA = PA.

    How does the VA to PA translation take place? - Later I will write a few examples of translation and post them in a reply to this topic.

    https://www.microchip.com/Developmenttools/ProductDetails/ATSAMA5D2C-XULT

    Download "IAR Software package 2.13 for EWARM" from there, unpack it and get the file "\target\sama5d2\board_support.c".

  • I have mentioned the addresses of the RAM and ROM. In this case, how can we map them into virtual memory?

    Is flat mapping enabled by default, even if we enable the MMU?

    If flat mapping is enabled, will the VA for the RAM be the same as the PA defined (0x3EF00000 - 0x3EFFFFFF)?

    What address goes in the section field of the translation table entry (0x3EF00000)?

    For ROM, should there be multiple entries in the translation table, or is only one entry required?

    Inside the ROM there are sections defined for boot code, application code and others. Do we need to make entries for them?

    Thanks Van

  • I have mentioned the addresses of the RAM and ROM. In this case, how can we map them into virtual memory? - Very easy: just map carefully and configure the attributes correctly))

    Is flat mapping enabled by default, even if we enable the MMU? - If the MMU is disabled, the CPU operates on physical addresses (flat mapping). Only when you enable the MMU does the CPU operate on virtual addresses. The example code above was written for a demo project; in that case the CPU operates on virtual addresses, but with a flat mapping configured in the MMU translation table.

    If flat mapping is enabled, will the VA for the RAM be the same as the PA defined (0x3EF00000 - 0x3EFFFFFF)? - Yes. With a flat mapping the VA equals the PA. The same is true when the MMU is disabled: all virtual addresses map directly to the corresponding physical addresses.

    What address goes in the section field of the translation table entry (0x3EF00000)? - I can't understand this question(

    For ROM, should there be multiple entries in the translation table, or is only one entry required? - It is your choice how many entries the translation table will contain. You can hide almost all addresses just by writing zeros to the table entries, but a 4MB ROM, for example, needs 4 entries of 1MB each in the first-level translation table.

    Inside the ROM there are sections defined for boot code, application code and others. Do we need to make entries for them? - I am not an expert in writing MMU tables (I only know how the MMU works and translates addresses), but I think the answer is yes: you have to properly configure all pages that will be used during CPU operation.

    Example of mapping:

    Suppose we define the memory for the table as

    ALIGNED(16384) static uint32_t tlb[4096];

    The CPU generates VA = 0x0020 0404. The high 12 bits (0x002) are the index of the entry in the first-level translation table (in this example, entry #2), and the low 20 bits (0x00404) are used as the offset within the 1MB section.

    Suppose TTBR = 0xFFFF 0000. This is the address of the first-level translation table, so

    0xFFFF 0000 + (0x002 * 4) = 0xFFFF 0008 is the address of the entry in the table. The entry in turn gives us the upper part of the physical address and the attributes of the memory section.

    Suppose we configure table entry #2 like this:

    tlb[0x002] = TTB_SECT_ADDR(0x00800000)
               | TTB_SECT_AP_FULL_ACCESS
               | TTB_SECT_DOMAIN(0xf)
               | TTB_SECT_EXEC
               | TTB_SECT_CACHEABLE_WB
               | TTB_TYPE_SECT;

    Then PA = 0x00800000 (the high 12 bits of TTB_SECT_ADDR(0x00800000)) + 0x00404 (the offset within the section) = 0x00800404.
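The same arithmetic can be written as two small helpers. This is a host-side sketch only: the function names are invented for this example, and it assumes TTBCR.N = 0 (one 16KB first-level table addressed by TTBR0) and a section-type entry.

```c
#include <stdint.h>

/* Address of the first-level descriptor that covers a virtual address. */
uint32_t l1_descriptor_address(uint32_t ttbr, uint32_t va)
{
    uint32_t table_base = ttbr & 0xFFFFC000u; /* table is 16 KB aligned */
    uint32_t index = va >> 20;                /* VA[31:20] selects the entry */
    return table_base | (index << 2);         /* 4 bytes per entry */
}

/* Physical address produced by a section descriptor for a virtual address. */
uint32_t section_physical_address(uint32_t descriptor, uint32_t va)
{
    return (descriptor & 0xFFF00000u)         /* PA[31:20] from the entry */
         | (va & 0x000FFFFFu);                /* PA[19:0] = offset in section */
}
```

For the numbers above, `l1_descriptor_address(0xFFFF0000, 0x00200404)` gives 0xFFFF0008, and a descriptor with section base 0x00800000 gives PA 0x00800404 regardless of its attribute bits.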

  • Are the addresses in the section field the physical addresses of the memory modules? - Yes, those are the physical addresses of the memory modules.

  • If you want to fundamentally understand the purpose of memory management, I recommend reading the book "Modern Operating Systems" by Andrew S. Tanenbaum, particularly the chapter "Memory Management".

  • Van,

    I went through the programmer's guide for the Cortex-A5, and it's pretty clear now.

    I thought I would define translation table entries for 1MB sections and give it a try. But consider the scenario below.

    For a 1MB SRAM made of two 512KB regions, suppose 2 entries are made:

    tlb[0] = sect(0x3ef00000)   /* This is for SRAM0 */

    Here we say the section size is 1MB, so when we define memory attributes they will apply to the whole 1MB area instead of 512KB?

    The same attributes will then also apply to the SRAM1 area below, because it falls inside the 1MB section whose attributes were defined above for SRAM0.

    Is it correct to do so, or do we need to use smaller page sizes for memory regions of 512KB or less?

    tlb[1] = sect(0x3ef80000)   /* This is for SRAM1 */

    Coming to the larger 4MB flash, four entries can be made:

    tlb[2] = sect(0x18000000)   /* Each 1MB section will have its own attributes, correct? */

    tlb[3] = sect(0x18100000)

    tlb[4] = sect(0x18200000)

    tlb[5] = sect(0x18300000)

    Understanding:

    With 1MB sections, a single entry cannot describe a 128KB memory region? In that case we need to use a smaller page size, such as two 64KB pages?

    Is my understanding correct, Van?

    Thanks.

  • tlb[0] = sect(0x3ef00000)   /* This is for SRAM0 */ - This is correct. In this case virtual addresses in the range 0x00000000 - 0x000FFFFF are mapped to physical addresses in the range 0x3EF00000 - 0x3EFFFFFF, and SRAM0 and SRAM1 are both covered by the tlb[0] entry (512KB + 512KB = 1MB). Entries of the first-level translation table can't describe anything smaller than 1MB. If you want to divide this section of the address space into smaller chunks, you have to define a second-level translation table and make the first-level entry point to it. In the second-level translation table you can then divide your address space into pages of 64KB or 4KB.

    tlb[1] = sect(0x3ef80000)   /* This is for SRAM1 */ - This is not correct. A section base address must be aligned to 1MB, and 0x3EF80000 is not.

    The other definitions seem to be correct.
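To give SRAM0 and SRAM1 different attributes, the 1MB at 0x3EF00000 could be described by one second-level table of 256 small (4KB) pages. The sketch below uses the ARMv7-A short-descriptor field positions; the names and the choice of attributes are assumptions made for illustration, not code from the Atmel package.

```c
#include <stdint.h>

#define L2_ENTRIES 256u               /* 256 x 4 KB pages = 1 MB */
#define PAGE_SIZE  0x1000u

/* Small-page attribute bits (illustrative names). */
#define PAGE_XN    (1u << 0)          /* execute never */
#define PAGE_B     (1u << 2)          /* bufferable */
#define PAGE_C     (1u << 3)          /* cacheable */
#define PAGE_AP_RW (3u << 4)          /* AP[1:0] = 0b11, full access */

/* First-level entry pointing to a second-level table (1 KB aligned). */
uint32_t l1_page_table_desc(uint32_t l2_table_pa, uint32_t domain)
{
    return (l2_table_pa & 0xFFFFFC00u) | ((domain & 0xFu) << 5) | 0x1u;
}

/* Second-level small-page entry: bit 1 set, page base in bits [31:12]. */
uint32_t l2_small_page_desc(uint32_t page_pa, uint32_t attrs)
{
    return (page_pa & 0xFFFFF000u) | attrs | 0x2u;
}

/* First 512 KB (SRAM0) gets attrs0, second 512 KB (SRAM1) gets attrs1. */
void fill_split_sram(uint32_t l2[L2_ENTRIES], uint32_t sram_pa,
                     uint32_t attrs0, uint32_t attrs1)
{
    for (uint32_t i = 0; i < L2_ENTRIES; i++) {
        uint32_t attrs = (i < L2_ENTRIES / 2) ? attrs0 : attrs1;
        l2[i] = l2_small_page_desc(sram_pa + i * PAGE_SIZE, attrs);
    }
}
```

The first-level entry for the chosen virtual megabyte would then hold `l1_page_table_desc(address_of_l2_table, domain)` instead of a section descriptor. For example, SRAM0 could be cacheable and executable while SRAM1 is cacheable with `PAGE_XN` set.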

  • ok..

    If there is anything else, will get back to you.

    Thank you so much Van.

  • Hi Van,

    I have made the translation table entries, flashed the application and am debugging it. When I enable the MMU, a translation (section) fault occurs.

    The memory dump becomes locked (all "?" marks).

    I am using 1MB sections, so we have 4096 entries in the L1 translation table.

    Can I use any entry 'x' to hold the information for any 1MB section 'y'?

    Way 1: starting from address 0x0000_0000 up to 0xFFFF_FFFF

    entry[0x000] = sect[0x00000000 - 0x000FFFFF]   // section 1

    entry[0x001] = sect[0x00100000 - 0x001FFFFF]   // section 2

    entry[0x002] = sect[0x00200000 - 0x002FFFFF]   // section 3

    Way 2:

    entry[0x000] = sect[0x18F00000 - 0x18FFFFFF]

    entry[0x001] = sect[0x19000000 - 0x190FFFFF]

    entry[0x002] = sect[0x19100000 - 0x191FFFFF]

    Are both of the above ways valid?

    FYI : ARM Technical Reference Manual

    http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0333h/Babicjaf.html

    6.9.3. Translation fault

    There are two types of translation fault:

    Section

    A section translation fault occurs if:

    • The TLB tries to perform a page table walk but the page table walk is disabled by one of the PD0 or PD1 bits. For more details, see Hardware page table translation.

    • The TLB fetches a first level translation table descriptor, and this first level descriptor is invalid. This is the case when bits[1:0] of this descriptor are b00 or b11.

    1. I have checked: it is not because of the PD0 bit.

    2. Does the second point refer to the two lowest bits of the entry? Because I am using section-type entries, the lower 2 bits are "10".

    Can you please suggest something?

    Thanks
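One way to narrow down a section translation fault is to look at the descriptor the MMU actually fetches for the faulting virtual address. The sketch below decodes bits[1:0] of a first-level entry as in the ARMv7-A short-descriptor format without LPAE. The enum and function names are invented for this example, and it has not been run against the table in question.

```c
#include <stdint.h>

typedef enum {
    L1_FAULT,         /* bits[1:0] = 0b00: translation fault */
    L1_PAGE_TABLE,    /* bits[1:0] = 0b01: points to a second-level table */
    L1_SECTION,       /* bits[1:0] = 0b10, bit 18 = 0: 1 MB section */
    L1_SUPERSECTION,  /* bits[1:0] = 0b10, bit 18 = 1: 16 MB supersection */
    L1_RESERVED       /* bits[1:0] = 0b11: treated as a fault here */
} l1_type_t;

/* Classify a first-level descriptor by its type bits. */
l1_type_t l1_descriptor_type(uint32_t desc)
{
    switch (desc & 0x3u) {
    case 0x0: return L1_FAULT;
    case 0x1: return L1_PAGE_TABLE;
    case 0x2: return (desc & (1u << 18)) ? L1_SUPERSECTION : L1_SECTION;
    default:  return L1_RESERVED;
    }
}

/* Type of the entry that covers a virtual address: index = VA[31:20]. */
l1_type_t l1_lookup_type(const uint32_t *l1, uint32_t va)
{
    return l1_descriptor_type(l1[va >> 20]);
}
```

Because the index is taken from the virtual address, "Way 2" maps VA 0x00000000 - 0x002FFFFF onto the physical range starting at 0x18F00000. If the code still runs at its physical addresses after the MMU is turned on (an assumption about this setup), its accesses look up entry 0x18F, which would still be zero and so decode as `L1_FAULT`.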

  • Can you send the code with MMU configuration to my email (you can find email in my contacts)?