Kernel page table access causes a page fault although another core has already mapped the address.

Hi experts. I'm writing a CacheFlush function that operates by virtual address.

I'm using TTBR0 for the user area and TTBR1 for the kernel area, on a dual-core Cortex-A9.

The cache policy is Write-Through for the kernel page table (KPT below) itself, and Write-Back Write-Allocate for the virtual addresses it maps.

When Core0 maps a new virtual address (VA below) in the KPT

and Core1 then accesses that VA, which Core0 has already mapped,

Core1 sometimes takes a Data Abort (a page fault on the access described above).

I've tried to find the reason, but I have no idea what causes this problem.

I would be very grateful for your help.

  • Hello,

    I think you are misunderstanding.

    Each core has its own TTBR0 and TTBR1, and address translation is performed by each core's MMU. Therefore you should set up the same mapping for core1 as for core0.

    By the way, on the Cortex-A9 the write-through policy behaves the same as uncached.

    Best regards,

    Yasuhiko Koumoto.

  • I've already done as you say.

    Each core has the same TTBR1 value (because it is the kernel area).

    But TTBR0 can differ between cores (it's the user area).

    The problem is that although I use the same TTBR1 on every core, when one core maps a VA in the TTBR1 table, the other core can't see that mapping.

  • Hello,

    Did you set TTBCR.PD0 (bit 4) to '1' before the TTBR1 address translation?

    Best regards,

    Yasuhiko Koumoto.

  • I'm trying to use both TTBR0 and TTBR1, so depending on which range a VA falls in, either TTBR1 or TTBR0 is used.

    In my case, I'm using TTBR1 for the kernel and TTBR0 for user space.

    So both TTBR1 and TTBR0 are already enabled.

  • My understanding from the ARM ARM is that TTBR0 and TTBR1 are mutually exclusive. Therefore I think it would be impossible for both TTBR0 and TTBR1 to be enabled at the same time, and the switch between TTBR0 and TTBR1 would be TTBCR.PD0. Is my understanding not correct?

    Also, could you tell me how you enabled both TTBR0 and TTBR1?

    Best regards,

    Yasuhiko Koumoto.

  • Sorry for my late answer, yasuhikokoumoto.

    I think it's better to look at the ARM Architecture Reference Manual, ARMv7-A and ARMv7-R edition.

    In that document, section B4.1.143 describes the TTBCR register; the PD0 and PD1 flags can each be set independently and are not exclusive.

    Section B3.5.4 explains well how to use both TTBR0 and TTBR1.

    Please refer to these sections. Thank you very much.
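To illustrate the point about using both registers: with the Short-descriptor format, TTBCR.N decides which register a walk starts from. If N is 0, TTBR0 is always used; otherwise TTBR0 is used only when the top N bits of the VA are all zero, and TTBR1 is used for everything else. The following is a minimal sketch of that rule in plain C. The function names `ttbr_for_va` and `ttbr0_table_size` are made up for this illustration and do not come from any library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 0 if the walk for 'va' starts from TTBR0,
 * 1 if it starts from TTBR1, for a given TTBCR.N (Short-descriptor format). */
static int ttbr_for_va(uint32_t va, unsigned n)
{
    assert(n <= 7);                 /* TTBCR.N is a 3-bit field */
    if (n == 0)
        return 0;                   /* N == 0: TTBR1 is never used */
    /* TTBR0 covers the VAs whose top N bits are all zero. */
    return (va >> (32 - n)) == 0 ? 0 : 1;
}

/* Hypothetical helper: size in bytes of the first-level table that TTBR0
 * points to. It shrinks from 16KB as N grows. */
static uint32_t ttbr0_table_size(unsigned n)
{
    assert(n <= 7);
    return 16384u >> n;
}
```

With N = 2, for example, VAs below 0x40000000 are translated through TTBR0 and everything from 0x40000000 upwards goes through TTBR1, while the TTBR0 first-level table is 4KB.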

  • Hello,

    Thank you for your input and for correcting me.

    I had misunderstood.

    Let's go back to the beginning.

    Please let us know your settings of TTBR0, TTBR1 and TTBCR on each core.

    Also let us know the VA at which the problem occurred.

    Best regards,

    Yasuhiko Koumoto.

  • Hi levi,

    I've already done as you say.

    Each core has the same TTBR1 value (because it is the kernel area).

    But TTBR0 can differ between cores (it's the user area).

    The problem is that although I use the same TTBR1 on every core, when one core maps a VA in the TTBR1 table, the other core can't see that mapping.

    There are many reasons why this might fail, the most common being improper cache and TLB maintenance. Since you're writing a "CacheFlush" function (we'd love it if you told us if you mean Clean or Clean and Invalidate... there is no "flush" terminology in the ARM Architecture) it may be that something is amiss with that. You have, unfortunately, walked into the most difficult to diagnose issue when writing your own OS.

    It would be interesting to know how you have mapped the translation tables themselves (i.e. what the descriptors that cover the PA for TTBR0/1 and the subsequent levels are) in terms of inner shareability and cacheability vs. the TTBCR settings for the table walker (they'll need to be *identical* in terms of cacheability and shareability, otherwise the MMU table walker and the core itself when modifying descriptors will be non-coherent).

    What kind of TLB invalidation do you do when modifying the translation table descriptor?

    Are you properly handling the 2 or 3 possible software errata that affect the Cortex-A9 and its handling of translation table descriptor changes, maintenance broadcasts and propagation to PoU/PoC (check the Software Developers' Errata Notice for your Cortex-A9 revision, depending on which one you have you could be looking at several workarounds, and there may be more that apply to your code)?

    Ta,

    Matt

  • First of all, sorry for my late answer, Yasuhiko Koumoto and Matt.

    There are many reasons why this might fail, the most common being improper cache and TLB maintenance. Since you're writing a "CacheFlush" function (we'd love it if you told us if you mean Clean or Clean and Invalidate... there is no "flush" terminology in the ARM Architecture) it may be that something is amiss with that. You have, unfortunately, walked into the most difficult to diagnose issue when writing your own OS.

    => Sorry, I mean cache Clean and Invalidate.

    It would be interesting to know how you have mapped the translation tables themselves (i.e. what the descriptors that cover the PA for TTBR0/1 and the subsequent levels are) in terms of inner shareability and cacheability vs. the TTBCR settings for the table walker (they'll need to be *identical* in terms of cacheability and shareability, otherwise the MMU table walker and the core itself when modifying descriptors will be non-coherent).

    What kind of TLB invalidation do you do when modifying the translation table descriptor?

    => I set TTBR0, TTBR1 and TTBCR as below.

    • TTBR0 - IRGN/RGN: Write-Back Write-Allocate Cacheable, Shareable (0x______6A)
    • TTBR1 - IRGN/RGN: Write-Back Write-Allocate Cacheable, Shareable (0x______6A)
    • TTBCR - PD1: No, PD0: No, N: 2, i.e. a 4KB TTBR0 table (0x0000_0002)
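As a cross-check of the 0x6A value listed above, here is a minimal sketch that decodes the attribute bits in the low byte of a TTBR. It assumes the Short-descriptor TTBR layout of ARMv7-A with the Multiprocessing Extensions (IRGN split across bits 0 and 6, S in bit 1, RGN in bits 4:3, NOS in bit 5). The struct and function names are made up for this illustration.

```c
#include <stdint.h>

/* Hypothetical container for the table-walk attributes held in a TTBR.
 * Cacheability encoding for irgn/rgn:
 *   0 = Non-cacheable, 1 = Write-Back Write-Allocate,
 *   2 = Write-Through, 3 = Write-Back no Write-Allocate. */
struct ttbr_attrs {
    unsigned irgn;  /* inner cacheability of table walks */
    unsigned rgn;   /* outer cacheability of table walks */
    unsigned s;     /* 1 = walks are Shareable */
    unsigned nos;   /* with s == 1: 1 = Inner Shareable, 0 = Outer Shareable */
};

/* Hypothetical helper: extract the attribute fields from a TTBR value. */
static struct ttbr_attrs ttbr_decode(uint32_t ttbr)
{
    struct ttbr_attrs a;
    /* IRGN is split: IRGN[1] lives in bit 0 and IRGN[0] in bit 6. */
    a.irgn = ((ttbr & 1u) << 1) | ((ttbr >> 6) & 1u);
    a.rgn  = (ttbr >> 3) & 3u;
    a.s    = (ttbr >> 1) & 1u;
    a.nos  = (ttbr >> 5) & 1u;
    return a;
}
```

Under that assumed layout, 0x6A decodes to inner and outer Write-Back Write-Allocate, Shareable, Inner Shareable, which agrees with the description in the list.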

    I'm using the Short-descriptor translation table format, and the VAs used to access each translation table entry are mapped through TTBR1. Subsequent levels are mapped in the same way.

    For the cache policy of the translation tables (both TTBR0 and TTBR1), I'm using Shareable, Write-Back Write-Allocate.

    When a translation table is updated, a cache clean and invalidate is done first, using the sequence of operations below.

         asm volatile (
           DCCIMVAC for VA for entry
           dsb
           isb
         );

    Then a TLB invalidate is performed, using the sequence of operations below.

         asm volatile (
           isb
           dsb
           ITLBIALL for VA for entry
           DTLBIALL for VA for entry
           TLBIALLIS for VA for entry
           dsb
           isb
         );
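For comparison with the two sequences above, here is a minimal sketch of the ordering that the ARMv7-A architecture manual gives as an example for changing a translation table entry: write the descriptor, clean it, DSB, invalidate the TLB entry, invalidate the branch predictor, DSB, ISB. Several things here are assumptions and not taken from the thread: the mapping is a global kernel mapping (so TLBIMVAAIS, invalidate by MVA for all ASIDs, Inner Shareable, is used), the function name `kpt_write_entry` and the `KPT_BARE_METAL` opt-in macro are hypothetical, and the clean may be unnecessary if the table walker is coherent with the data cache. Note that the TLBIALL-type operations ignore the register value, so passing a VA to them does not narrow the invalidation. When the file is built without `KPT_BARE_METAL`, the operations are only logged so the order can be inspected on a host machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum kpt_op { OP_STORE, OP_DCCMVAC, OP_DSB, OP_TLBIMVAAIS, OP_BPIALLIS, OP_ISB };

#if defined(__arm__) && defined(KPT_BARE_METAL)
/* Privileged ARMv7-A build: issue the real CP15 operations and barriers. */
#define KPT_NOTE_STORE()  ((void)0)
#define KPT_DCCMVAC(a)    __asm__ volatile("mcr p15, 0, %0, c7, c10, 1" :: "r"(a) : "memory")
#define KPT_TLBIMVAAIS(a) __asm__ volatile("mcr p15, 0, %0, c8, c3, 3" :: "r"(a) : "memory")
#define KPT_BPIALLIS()    __asm__ volatile("mcr p15, 0, %0, c7, c1, 6" :: "r"(0) : "memory")
#define KPT_DSB()         __asm__ volatile("dsb" ::: "memory")
#define KPT_ISB()         __asm__ volatile("isb" ::: "memory")
#else
/* Host build: record the order of operations only, nothing is executed. */
static enum kpt_op kpt_log[16];
static size_t kpt_log_len;

static void kpt_note(enum kpt_op op)
{
    assert(kpt_log_len < sizeof kpt_log / sizeof kpt_log[0]);
    kpt_log[kpt_log_len++] = op;
}
#define KPT_NOTE_STORE()  kpt_note(OP_STORE)
#define KPT_DCCMVAC(a)    ((void)(a), kpt_note(OP_DCCMVAC))
#define KPT_TLBIMVAAIS(a) ((void)(a), kpt_note(OP_TLBIMVAAIS))
#define KPT_BPIALLIS()    kpt_note(OP_BPIALLIS)
#define KPT_DSB()         kpt_note(OP_DSB)
#define KPT_ISB()         kpt_note(OP_ISB)
#endif

/* Hypothetical helper: install 'desc' at '*entry', the descriptor that
 * translates 'va', and make the change visible to all cores. */
static void kpt_write_entry(volatile uint32_t *entry, uint32_t desc, uintptr_t va)
{
    *entry = desc;                           /* 1. write the new descriptor     */
    KPT_NOTE_STORE();
    KPT_DCCMVAC((uintptr_t)entry);           /* 2. clean the descriptor's line  */
    KPT_DSB();                               /* 3. make the write visible       */
    KPT_TLBIMVAAIS(va & ~(uintptr_t)0xFFF);  /* 4. drop stale TLB entries on    */
                                             /*    all Inner Shareable cores    */
    KPT_BPIALLIS();                          /* 5. invalidate branch predictors */
    KPT_DSB();                               /* 6. wait for the maintenance     */
    KPT_ISB();                               /* 7. resynchronise this core      */
}
```

The main differences from the sequences above are that the descriptor is cleaned after it is written and before the TLB operation, and that the TLB invalidation targets one page and is broadcast to the Inner Shareable domain.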

    Are you properly handling the 2 or 3 possible software errata that affect the Cortex-A9 and its handling of translation table descriptor changes, maintenance broadcasts and propagation to PoU/PoC (check the Software Developers' Errata Notice for your Cortex-A9 revision, depending on which one you have you could be looking at several workarounds, and there may be more that apply to your code)?

     

    => Sorry, that was my mistake: I'm using a Cortex-A7, not a Cortex-A9.

    I had already checked the Cortex-A9 errata related to cache invalidation, but I can't find any Cortex-A7 errata related to cache invalidation.

    I think the cause of this problem is stale information about VA_1 remaining in the TLB of Core1, even though Core0 has mapped VA_1 in the kernel translation table (TTBR1).

    So I changed my code to perform a TLB invalidation before Core1 accesses VA_1.

    So far, the problem has not happened again.

    But I can't be sure this is the correct solution, because the problem only happens occasionally.

         Thx.

     

         Yun,

         Levi.

  • First of all, sorry for my late answer, Yasuhiko Koumoto and Matt.

    The problem happens in the kernel address area.

    But I don't think the exact hex address would be useful information for you.

    Thx,

    Yun,

    Levi