Kernel page table causes a page fault even though another core has already mapped the address

Hi, experts. I'm writing a cache flush function that operates by virtual address.

I'm using TTBR0 for the user area and TTBR1 for the kernel area, on a dual-core Cortex-A9.

The kernel page table (KPT below) itself uses a Write-Through cache policy, and ordinary virtual addresses use Write-Back Write-Allocate.

When Core0 maps a new virtual address (VA below) in the KPT and Core1 then accesses that VA, Core1 sometimes takes a Data Abort (page fault) on that access.

I have tried to find the reason, but I have no idea what causes this problem.

I would be very grateful for any help.

Parents
  • Hi levi,

    I have already done as you say.

    Each core has the same TTBR1 value (because it is the kernel area).

    But TTBR0 can differ between cores (it is the user area).

    The problem is that although every core uses the same TTBR1, when one core maps a VA through TTBR1 the other core cannot see that area.

    There are many reasons why this might fail, the most common being improper cache and TLB maintenance. Since you're writing a "CacheFlush" function (we'd love it if you told us if you mean Clean or Clean and Invalidate... there is no "flush" terminology in the ARM Architecture) it may be something is amiss with that. You have, unfortunately, walked into the most difficult to diagnose issue when writing your own OS.

    It would be interesting to know how you have mapped the translation tables themselves (i.e. what the descriptors that cover the PA for TTBR0/1 and the subsequent levels are) in terms of inner shareability and cacheability vs. the TTBCR settings for the table walker (they'll need to be *identical* in terms of cacheability and shareability, otherwise the MMU table walker and the core itself when modifying descriptors will be non-coherent).

    What kind of TLB invalidation do you do when modifying the translation table descriptor?

    Are you properly handling the 2 or 3 possible software errata that affect the Cortex-A9 and its handling of translation table descriptor changes, maintenance broadcasts and propagation to PoU/PoC? Check the Software Developers' Errata Notice for your Cortex-A9 revision: depending on which one you have you could be looking at several workarounds, and there may be more that apply to your code.

    Ta,

    Matt

Children
  • First of all, sorry for my late answer to Yasuhiko Koumoto and Matt.

    There are many reasons why this might fail, the most common being improper cache and TLB maintenance. Since you're writing a "CacheFlush" function (we'd love it if you told us if you mean Clean or Clean and Invalidate... there is no "flush" terminology in the ARM Architecture) it may be something is amiss with that. You have, unfortunately, walked into the most difficult to diagnose issue when writing your own OS.

    => Sorry, I mean cache Clean and Invalidate.

    It would be interesting to know how you have mapped the translation tables themselves (i.e. what the descriptors that cover the PA for TTBR0/1 and the subsequent levels are) in terms of inner shareability and cacheability vs. the TTBCR settings for the table walker (they'll need to be *identical* in terms of cacheability and shareability, otherwise the MMU table walker and the core itself when modifying descriptors will be non-coherent).

    What kind of TLB invalidation do you do when modifying the translation table descriptor?

    => I set TTBR0, TTBR1 and TTBCR as below.

    • TTBR0 - IRGN/RGN: Write-Back Write-Allocate Cacheable, Shareable (0x______6A)
    • TTBR1 - IRGN/RGN: Write-Back Write-Allocate Cacheable, Shareable (0x______6A)
    • TTBCR - PD1:No, PD0:No, N:4KB (0x0000_0002)
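As a cross-check on the register values listed above, the low bits of a TTBR value can be decoded in C. This is only a sketch: it assumes the ARMv7-A short-descriptor TTBR layout with the Multiprocessing Extensions (IRGN[0] in bit 6, NOS in bit 5, RGN in bits 4:3, S in bit 1, IRGN[1] in bit 0), and `struct ttbr_attrs`, `ttbr_decode` and `cache_policy_name` are hypothetical names, not part of any library. The elided base-address bits of `0x______6A` are not needed for the decode.

```c
#include <assert.h>
#include <stdint.h>

/* Table-walk attributes held in the low bits of TTBR0/TTBR1
 * (assumed layout: ARMv7-A with Multiprocessing Extensions). */
struct ttbr_attrs {
    unsigned irgn; /* inner cacheability, IRGN[1:0] */
    unsigned rgn;  /* outer cacheability, RGN[1:0]  */
    unsigned s;    /* Shareable bit                 */
    unsigned nos;  /* Not Outer Shareable bit       */
};

/* Hypothetical helper: extract the attribute fields from a TTBR value. */
static struct ttbr_attrs ttbr_decode(uint32_t ttbr)
{
    struct ttbr_attrs a;
    /* IRGN is split: IRGN[1] is bit 0, IRGN[0] is bit 6. */
    a.irgn = ((ttbr & 1u) << 1) | ((ttbr >> 6) & 1u);
    a.rgn  = (ttbr >> 3) & 3u;
    a.s    = (ttbr >> 1) & 1u;
    a.nos  = (ttbr >> 5) & 1u;
    return a;
}

/* IRGN and RGN share the same 2-bit encoding. */
static const char *cache_policy_name(unsigned bits)
{
    static const char *const names[4] = {
        "Non-cacheable",
        "Write-Back Write-Allocate",
        "Write-Through",
        "Write-Back no Write-Allocate",
    };
    assert(bits < 4);
    return names[bits];
}
```

Under that assumed layout, a value ending in `0x6A` decodes to IRGN = 0b01 and RGN = 0b01 (both Write-Back Write-Allocate) with S = 1 and NOS = 1, which agrees with the description in the list. The same decode can be compared against the attributes of the descriptors that map the table memory itself.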

         I'm using the Short-descriptor translation table format, and the VAs used to access each translation table entry are mapped through TTBR1. The subsequent levels are also mapped in the same way.
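To make the TTBR0/TTBR1 split concrete, here is a minimal sketch of how the Short-descriptor format selects the base register from TTBCR.N. It assumes the TTBCR value 0x0000_0002 quoted above means N = 2; `va_uses_ttbr1` and `ttbr0_table_bytes` are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a short-descriptor walk for va starts from TTBR1.
 * With TTBCR.N = n, TTBR0 is used only when the top n bits of the
 * VA are all zero; with n == 0, TTBR1 is never used. */
static bool va_uses_ttbr1(uint32_t va, unsigned n)
{
    assert(n <= 7);
    if (n == 0)
        return false;
    return (va >> (32u - n)) != 0;
}

/* Size in bytes of the first-level table that TTBR0 points to:
 * 16 KB when n == 0, halved for each increment of n. */
static uint32_t ttbr0_table_bytes(unsigned n)
{
    assert(n <= 7);
    return 0x4000u >> n;
}
```

With N = 2, addresses below 0x40000000 are translated through TTBR0 and everything from 0x40000000 upward goes through TTBR1, and the TTBR0 first-level table is 4 KB, which seems to be what "N:4KB" refers to.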

         As for the cache policy of the translation tables (both TTBR0 and TTBR1), I'm using Shareable, Write-Back Write-Allocate.

         When a translation table is updated, the cache Clean and Invalidate is done first, with the sequence of operations below.

         asm volatile (
           DCCIMVAC for VA for entry
           dsb
           isb
         );

         Then the TLB invalidation is performed, with the sequence of operations below.

          asm volatile (
              isb
              dsb
              ITLBIALL for VA for entry
              DTLBIALL for VA for entry
              TLBIALLIS for VA for entry
              dsb
              isb
          );
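For comparison, here is a minimal sketch of a descriptor-update sequence in the order commonly shown for ARMv7-A: write the descriptor, clean it, DSB, invalidate the TLB entry by MVA across the Inner Shareable domain, invalidate the branch predictor, then DSB and ISB. It is not the code from this thread and not necessarily the fix for the fault described here. `kpt_set_entry` and the macro wrappers are hypothetical names, the CP15 encodings are the ARMv7-A ones (privileged mode only), and on a non-ARM host the macros only record the order of operations in `op_trace` so the sequence can be checked.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__arm__)
/* Real CP15 maintenance operations (ARMv7-A, privileged mode only). */
#define DCCMVAC(va)    __asm__ volatile("mcr p15, 0, %0, c7, c10, 1" :: "r"(va) : "memory")
#define TLBIMVAAIS(va) __asm__ volatile("mcr p15, 0, %0, c8, c3, 3"  :: "r"(va) : "memory")
#define BPIALLIS()     __asm__ volatile("mcr p15, 0, %0, c7, c1, 6"  :: "r"(0)  : "memory")
#define DSB()          __asm__ volatile("dsb" ::: "memory")
#define ISB()          __asm__ volatile("isb" ::: "memory")
#else
/* Host build: log each operation name instead of executing it. */
static char op_trace[128];

static void trace_op(const char *op)
{
    size_t used = strlen(op_trace);
    if (used + strlen(op) + 2 <= sizeof(op_trace)) {
        strcat(op_trace, op);
        strcat(op_trace, " ");
    }
}
#define DCCMVAC(va)    ((void)(va), trace_op("DCCMVAC"))
#define TLBIMVAAIS(va) ((void)(va), trace_op("TLBIMVAAIS"))
#define BPIALLIS()     trace_op("BPIALLIS")
#define DSB()          trace_op("DSB")
#define ISB()          trace_op("ISB")
#endif

/* Hypothetical helper: install one descriptor and make it visible.
 * entry is the VA of the table slot, va is the address being mapped. */
static void kpt_set_entry(volatile uint32_t *entry, uint32_t desc, uint32_t va)
{
    assert(entry != NULL);
    *entry = desc;               /* write the new descriptor            */
    DCCMVAC((uintptr_t)entry);   /* clean the line holding the entry    */
    DSB();                       /* wait for the write and the clean    */
    TLBIMVAAIS(va & ~0xFFFu);    /* drop stale TLB entries on all cores */
    BPIALLIS();                  /* invalidate branch predictors        */
    DSB();                       /* wait for the maintenance to finish  */
    ISB();                       /* resynchronise this core             */
}
```

Two differences from the sequences above are worth noting. The data-cache operation here runs after the descriptor store rather than before it, and the TLB operation is a single by-MVA Inner Shareable invalidate, so no per-core ITLBIALL or DTLBIALL is issued. Whichever variant is used, the updating core should complete the final DSB before it tells another core that the new VA is ready to use.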

    Are you properly handling the 2 or 3 possible software errata that affect the Cortex-A9 and its handling of translation table descriptor changes, maintenance broadcasts and propagation to PoU/PoC? Check the Software Developers' Errata Notice for your Cortex-A9 revision: depending on which one you have you could be looking at several workarounds, and there may be more that apply to your code.

     

        => Sorry, that was my mistake. I'm using a Cortex-A7, not a Cortex-A9.

             I had already checked the Cortex-A9 errata related to cache invalidation, but I can't find any Cortex-A7 errata related to cache invalidation.

             I think the cause of this problem is that stale information about VA_1 remains in the TLB of Core1, even though Core0 has mapped VA_1 in the kernel translation table (TTBR1).

             So I changed my code to perform a TLB invalidation before Core1 accesses VA_1.

             So far, the problem has not happened again.

             But I can't be sure this is the correct solution, because the problem only happened sometimes.

         Thx.

     

         Yun,

         Levi.