About TRACE32, DMA, and CACHE

Note: This was originally posted on 17th March 2009 at http://forums.arm.com

Hi ALL:

       I have some questions about TRACE32, DMA, and the CPU cache.
       I am using the OMAP331 MPU, which contains an ARM926EJ-S core and a DMA controller.

       I use DMA to read data from the peripheral NAND flash into SDRAM (starting at address 0x11700000). The results are as follows:

       (1). I use TRACE32 to load the program into SDRAM and run it, then call a function that starts a DMA transfer from NAND flash to SDRAM; the data is stored starting at address 0x11700000. After the DMA read finishes, I use "d.dump 0x11700000" to view the SDRAM content, and everything is OK: the content at 0x11700000 is the real content of the NAND flash.
       (2). If I first use the "d.dump 0x11700000" command to show the SDRAM content and then start the DMA transfer, the displayed content does not change after the DMA finishes. I guess this is because of the CPU cache, so I flush the cache, and then the content is updated immediately in TRACE32.
My question is: when TRACE32 asks the ARM9 for the content of address 0x11700000, how does the ARM9 answer TRACE32?

       (3). First I use the "d.dump 0x11700000" command to show the SDRAM content, then start the DMA transfer. After the DMA finishes, I flush the cache, so the displayed SDRAM content is updated. Then the program executes printf() to print the content of address 0x11700000 to HyperTerminal; assume the value shown on the PC is 0x12345678. Next I use TRACE32 to change the value at 0x11700000 from 0x12345678 to 0x0, and start the DMA transfer again, reading the same blocks from NAND flash to SDRAM at 0x11700000. After the DMA finishes, I flush the cache, but the content of 0x11700000 is not updated: it is still 0x0, when it should be 0x12345678. So I think the printf() call affects the cache state for address 0x11700000, but can anyone give me a more precise explanation of this result?

       (4). First I use the "d.dump 0x11700000" command to show the SDRAM content, then start the DMA transfer. After the DMA finishes, I flush the cache, so the displayed SDRAM content is updated. Then the program executes printf() to print the content of address 0x11700000 to HyperTerminal; assume the value shown on the PC is 0x12345678. Then I flush the cache again. Next I use TRACE32 to change the value at 0x11700000 from 0x12345678 to 0x0, and start the DMA transfer again, reading the same blocks from NAND flash to SDRAM at 0x11700000. After the DMA finishes, I flush the cache. The content of 0x11700000 changes from 0x0 to 0x12345678, which is correct.
Here I flush the CPU cache three times. Between the first two flushes there is a printf() call, so I think this call affects the cache state for address 0x11700000.

      Finally, the questions I have in mind:
     1. Does a DMA transfer use the CPU cache?
     2. How does the CPU answer TRACE32 when it asks for the content of an address?
     3. When TRACE32 changes the content of an address, what does the CPU do at the same time? Does it update its cache as well?
     4. After a DMA transfer, the content of some addresses will have changed. Will the CPU check these addresses and update the cache?
  • Note: This was originally posted on 17th March 2009 at http://forums.arm.com

    I am sorry, I need to correct my description of result (4) as follows:

    (4). First I use the "d.dump 0x11700000" command to show the SDRAM content, then start the DMA transfer. After the DMA finishes, I flush the cache, so the displayed SDRAM content is updated. Then the program executes printf() to print the content of address 0x11700000 to HyperTerminal; assume the value shown on the PC is 0x12345678. Next I use TRACE32 to change the value at 0x11700000 from 0x12345678 to 0x0. Then I flush the cache again, and start the DMA transfer again, reading the same blocks from NAND flash to SDRAM at 0x11700000. After the DMA finishes, I flush the cache. The content of 0x11700000 changes from 0x0 to 0x12345678, which is correct.
    Here I flush the CPU cache three times. Between the first two flushes there is a printf() call, so I think this call affects the cache state for address 0x11700000.
  • Note: This was originally posted on 19th March 2009 at http://forums.arm.com

    Is there anyone who can help me?

    Thanks very much for your help.
  • Note: This was originally posted on 19th March 2009 at http://forums.arm.com

    Apparently, not. Have you tried Lauterbach support first?
  • Note: This was originally posted on 21st March 2009 at http://forums.arm.com

    Hi Tum:

         Thank you for your advice.
         I did not ask Lauterbach for support; I thought somebody here might have had the same experience and would know why this occurs.

         This is not a problem for me, because I know how to avoid it. I only want to know why it occurs.

    Best Regards
  • Note: This was originally posted on 23rd March 2009 at http://forums.arm.com

    Hi sim:

        Thank you very much for your answer, this is really what I wanted to know.

        I am now marking the buffer that the DMA uses as uncacheable and unbufferable; as you said, it is more robust.

        Thank you again. :=)


    Best Regards
    Bu, YiTian
  • Note: This was originally posted on 21st March 2009 at http://forums.arm.com

    I'm not familiar with Trace32 or the OMAP331. However, assuming Trace32 is just using the ARM926's scan-based debug, and that the DMA is simply a separate master on a shared interconnect, that setup would produce the effect you see.

    Using scan debug to read and write memory results in the core being halted, and the debugger effectively forcing the 926 to perform read and write operations. Cacheable values will therefore be read from and written into the cache as per any normal 926 load/store, i.e. the debugger's view is the 926's view.

    The 926 doesn't support cache coherency with external masters. Pages marked as cacheable will therefore require cleaning before it can be guaranteed that the DMA can see the result of all stores from the 926. Likewise, pages marked as cacheable will need invalidation to guarantee that values written by the DMA can be seen by the 926, which is why you don't see the DMA-written values from the debugger until a cache invalidate has been performed. A more robust solution is to mark the DMA pages as non-cacheable.
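
The clean and invalidate rules above can be illustrated with a small host-side model. This is only a sketch, not 926 code: the class `NonCoherentSystem` and all of its method names are hypothetical, each cache line holds a single word rather than a real multi-word line, and the cache is assumed to be write-back with allocation on read misses only.

```python
class NonCoherentSystem:
    """Toy model: one CPU cache in front of memory shared with a DMA master.

    Assumptions (not taken from the thread): write-back cache, lines are
    allocated on read misses only, and one line holds exactly one word.
    """

    def __init__(self):
        self.memory = {}  # address -> word in SDRAM
        self.cache = {}   # address -> [word, dirty flag]

    def cpu_read(self, addr):
        # A hit returns the cached word even if memory changed underneath.
        if addr not in self.cache:
            self.cache[addr] = [self.memory.get(addr, 0), False]
        return self.cache[addr][0]

    def cpu_write(self, addr, value):
        # A hit updates only the cache and marks the line dirty;
        # a miss goes straight to memory (no write-allocate).
        if addr in self.cache:
            self.cache[addr] = [value, True]
        else:
            self.memory[addr] = value

    def dma_write(self, addr, value):
        self.memory[addr] = value  # the DMA master bypasses the cache

    def dma_read(self, addr):
        return self.memory.get(addr, 0)

    def clean(self, addr):
        # Write a dirty line back to memory so other masters can see it.
        line = self.cache.get(addr)
        if line is not None and line[1]:
            self.memory[addr] = line[0]
            line[1] = False

    def invalidate(self, addr):
        # Drop the line so the next CPU read refetches from memory.
        self.cache.pop(addr, None)


BUF = 0x11700000
soc = NonCoherentSystem()

# DMA -> CPU direction: invalidate after the DMA has written.
soc.cpu_read(BUF)                 # e.g. a debugger dump pulls the line in
soc.dma_write(BUF, 0x12345678)
stale = soc.cpu_read(BUF)         # cache hit, old value
soc.invalidate(BUF)
fresh = soc.cpu_read(BUF)         # refetched from memory

# CPU -> DMA direction: clean before the DMA reads.
soc.cpu_write(BUF, 0xCAFE)        # hit: the new value sits in the cache only
before_clean = soc.dma_read(BUF)
soc.clean(BUF)
after_clean = soc.dma_read(BUF)
```

In this model `stale` is the old value and `fresh` is the DMA-written one, which mirrors result (2) in the question. Since a debugger read goes through the same path as `cpu_read`, it sees the same stale line.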

    So, summarising the answers to your questions:

    1. No.
    2. Via proxied load/store operations.
    3. The core is performing the load/store.
    4. No, the 926 supports software-managed cache coherency only.
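
One possible reading of results (3) and (4) from the question, which the reply does not spell out: if the "flush" is a clean followed by an invalidate, and the debugger write lands in a line that printf() had already pulled into the cache, then the final flush writes the dirty 0x0 back over the freshly transferred DMA data. The sketch below replays both sequences on a toy model. The function `replay`, its flag, and the constants are hypothetical names, and the write-back, read-allocate behaviour is an assumption rather than something confirmed in the thread.

```python
BUF = 0x11700000
NAND_WORD = 0x12345678  # value the DMA copies from NAND in the question


def replay(flush_after_debugger_write):
    """Replay result (3) (flag False) or corrected result (4) (flag True)."""
    memory = {BUF: 0}
    cache = {}  # address -> [word, dirty flag]

    def cpu_read(addr):
        # Read miss allocates a clean line; a hit returns the cached word.
        if addr not in cache:
            cache[addr] = [memory[addr], False]
        return cache[addr][0]

    def cpu_write(addr, value):
        # Write hit stays in the cache (dirty); a miss goes to memory.
        if addr in cache:
            cache[addr] = [value, True]
        else:
            memory[addr] = value

    def flush():
        # Assumed meaning of "flush": clean dirty lines, then invalidate all.
        for addr, (value, dirty) in cache.items():
            if dirty:
                memory[addr] = value
        cache.clear()

    cpu_read(BUF)            # d.dump 0x11700000
    memory[BUF] = NAND_WORD  # first DMA transfer writes SDRAM directly
    flush()
    cpu_read(BUF)            # printf() reads the word, caching the line
    cpu_write(BUF, 0)        # debugger write hits the cached line
    if flush_after_debugger_write:
        flush()              # extra flush from corrected result (4)
    memory[BUF] = NAND_WORD  # second DMA transfer
    flush()                  # in (3) this writes the dirty 0x0 over the DMA data
    return cpu_read(BUF)


result_3 = replay(flush_after_debugger_write=False)
result_4 = replay(flush_after_debugger_write=True)
```

Under these assumptions `result_3` stays 0x0 while `result_4` is 0x12345678, matching what was observed. Marking the buffer non-cacheable, as suggested in the reply, avoids the problem because no dirty line for the buffer can exist.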

    hth
    s.