Cortex A8 preload engine (PLE) error

Note: This was originally posted on 24th November 2011 at http://forums.arm.com

I have a user-mode Linux application running on a Cortex-A8 (a TI 8148 DaVinci chip). I have a shared memory region that I'm using to pass data back and forth between the ARM core and the TI C674x DSP. The shared memory region is a ring buffer made of 32 KB segments (the size of one of the 8148's L2 cache ways). I've locked down 3 of the L2 cache ways and I'm trying to use the L2 PLE (preload engine) - the L2 feature accessed through coprocessor 15 c11 - to asynchronously preload and write back the ring buffer segments. The ring buffer itself is located in physically and virtually contiguous memory - we're using TI's cmem module to allocate out of a memory hole. Moreover, I've checked the Linux struct page flags for the ring buffer pages and they all seem to be uniform and fairly kosher. Plain-vanilla loads and stores from the ring buffer work just fine, as do coprocessor 15 based cache writeback operations (performed in privileged mode, of course).

Anyway, everything goes quite nicely for a while (anywhere from 3 to 10 PLE transfers complete successfully), until a PLE transfer errors out at a page boundary. It's a different page boundary (both virtual and physical address) each time, and the failure happens a different number of ring buffer segments in, and a different number of pages into the segment, each time. The error itself, from table 3-132 in the ARM Cortex-A8 Technical Reference Manual, is "b1000101", or "translation fault, section".
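
To make the "different segment, different page each time" observation easier to log, here is a minimal sketch of the address arithmetic. It assumes 4 KB pages and the 32 KB segment size described above, and that the ring buffer base and the faulting address are already known. The names `locate_fault`, `struct fault_location`, `PLE_PAGE_SIZE`, and `PLE_SEGMENT_SIZE` are hypothetical, not part of any TI or ARM API.

```c
#include <stddef.h>
#include <stdint.h>

#define PLE_PAGE_SIZE 4096u
#define PLE_SEGMENT_SIZE (32u * 1024u)  /* 32 KB ring buffer segment, per the post */

/* Hypothetical helper type: where in the ring buffer a PLE transfer stopped. */
struct fault_location {
    size_t segment;          /* index of the ring buffer segment */
    size_t page_in_segment;  /* page within that segment (0..7 for 32 KB / 4 KB) */
    int on_page_boundary;    /* 1 if the faulting address is 4 KB aligned */
};

/* Assumes fault_addr >= ring_base and both are in the same address space. */
static struct fault_location locate_fault(uintptr_t ring_base, uintptr_t fault_addr)
{
    struct fault_location loc;
    uintptr_t offset = fault_addr - ring_base;

    loc.segment = offset / PLE_SEGMENT_SIZE;
    loc.page_in_segment = (offset % PLE_SEGMENT_SIZE) / PLE_PAGE_SIZE;
    loc.on_page_boundary = (fault_addr % PLE_PAGE_SIZE) == 0;
    return loc;
}
```

Logging these three values for each failure would show whether the fault really does always land on a page boundary and never on the first page of a transfer.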

Does anyone know what this error means? At first I thought that maybe it was because the page was marked as uncached, but looking at the page properties (with /proc/kpageflags), that doesn't seem to be the case.

Edit: One more detail - this failure only happens with preload operations - not writebacks. Or at least I haven't seen it happen with a writeback yet.
  • Note: This was originally posted on 30th November 2011 at http://forums.arm.com

    Yes, the aim of the ASID is so that you don't have to flush the TLB on a context switch. What I am unclear on is what happens when you get a TLB miss while the PLE is running. I assume it would perform a table walk using the current page tables, but populated with the ASID value out of the Context ID register, which probably isn't what you wanted it to do (I guess you would want it to stop on an ASID mismatch for your use case).


    According to section 8.4.1 of the TRM, the PLE doesn't use the TLB and always walks the page table directly at the start of a transfer and at each 4 KB boundary. This should mean that the PLE Context ID register is compared against the global Context ID register, which gives the comparison a full 32 bits and should avoid potential aliasing. I would guess that you should be setting the PLE Context ID register to the current value of the Context ID register. It could be that the PLE doesn't bother checking equivalence for the first page, which would explain why you're succeeding until the second page is hit. This would make sense, since the first table walk is done before any data is transferred and is then valid for the entire page regardless of whether or not a context switch occurs, and this first table walk may have to succeed before the start operation can finish.

    Section 8.4.5 does claim that if the Context ID register changes during a PLE operation, the result is unpredictable. You would think they're really referring to the PLE Context ID register, since otherwise I don't understand the point of having it in the first place (if the current process's one can't change).
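
As a rough illustration of the reading of section 8.4.1 given in the reply, the sketch below counts how many page-table walks a single transfer would trigger: one at the start plus one for each 4 KB boundary crossed. Every walk after the first is a point where, under the reply's guess, a Context ID mismatch could surface. The function name `ple_table_walks` and the constant `PLE_PAGE_SIZE` are hypothetical, and the model itself is an assumption based on that reading, not confirmed hardware behaviour.

```c
#include <stddef.h>
#include <stdint.h>

#define PLE_PAGE_SIZE 4096u

/* Counts the 4 KB pages touched by [start, start + len), which under the
 * assumed model equals the number of page-table walks for one transfer. */
static size_t ple_table_walks(uintptr_t start, size_t len)
{
    uintptr_t first_page;
    uintptr_t last_page;

    if (len == 0)
        return 0;

    first_page = start / PLE_PAGE_SIZE;
    last_page = (start + len - 1) / PLE_PAGE_SIZE;
    return (size_t)(last_page - first_page) + 1;
}
```

For a page-aligned 32 KB ring buffer segment this gives 8 walks, so each transfer would have 7 page crossings after the initial walk at which a failure of this kind could show up.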