TCM alternative in A15

Note: This was originally posted on 19th October 2012 at http://forums.arm.com

Hi

We want to use an A15 core for a packet-processing function with very high throughput. This means that every packet is going to cause a cache miss if we use the regular caching mechanism.
To avoid that, we could do one of two things:

1. Use TCMs. But these are not present in the Cortex-A series. Does anyone know of an alternative that the A15 supports?
2. Stash the data directly into the L1 cache. But I could not find any information on whether this is possible.

Can anyone suggest the best way to achieve high-throughput packet processing on the A15?

Thanks
Sundeep
  • Note: This was originally posted on 19th October 2012 at http://forums.arm.com

    TCM wouldn't solve that by itself anyway, right? You still have to get the data into the TCM somehow; it doesn't magically fill up with new packets. That is done either by the ARM core (so no better than loading into cache, though beneficial if you use the data in the TCM many times), or by some form of DMA engine which can load the data into an unused portion of the TCM while the ARM core is off doing other work.

    For the A15 you don't have the option of a TCM, but I can see two options which work in a similar way to TCM+DMA.

    * Use cached memory, and issue "PLD" instructions or program the L2 cache preload engine "PLE" to load the data into the cache just ahead of when you need it. This moves the cache miss off the critical path, so it doesn't slow down the execution of the code your core is running.
    * Alternatively, you can put a DMA engine on the ACP port and use the DMA instead of the PLE to pull data into the L2 cache.

    HTH,
    Iso
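
As a rough illustration of the "PLD" option from the reply above, here is a minimal C sketch that prefetches the next packet while the current one is being processed. It assumes GCC or Clang, whose `__builtin_prefetch` is normally lowered to a PLD instruction on ARM targets, and it assumes packets sit back to back in a flat buffer. `PKT_SIZE`, `process_packet`, `prefetch_packet` and `process_ring` are hypothetical names made up for this example, and the per-packet work is a placeholder byte sum. How far ahead to prefetch would need tuning on real hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PKT_SIZE   256  /* hypothetical fixed packet buffer size in bytes */
#define CACHE_LINE 64   /* Cortex-A15 cache line size in bytes */

/* Placeholder per-packet work (assumption): sum the payload bytes. */
static uint32_t process_packet(const uint8_t *pkt)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < PKT_SIZE; i++)
        sum += pkt[i];
    return sum;
}

/* Issue one prefetch hint per cache line of a packet buffer. */
static void prefetch_packet(const uint8_t *pkt)
{
    for (size_t off = 0; off < PKT_SIZE; off += CACHE_LINE)
        __builtin_prefetch(pkt + off, 0 /* read */, 3 /* high locality */);
}

/* Process `count` packets stored back to back in `ring`. The prefetch
 * for packet i+1 is issued before packet i is processed, so the miss
 * can overlap with useful work instead of stalling the core. */
uint32_t process_ring(const uint8_t *ring, size_t count)
{
    assert(ring != NULL || count == 0);

    uint32_t total = 0;
    for (size_t i = 0; i < count; i++) {
        if (i + 1 < count)
            prefetch_packet(ring + (i + 1) * PKT_SIZE);
        total += process_packet(ring + i * PKT_SIZE);
    }
    return total;
}
```

The prefetch is only a hint: the result is the same with or without it, so the benefit has to be measured rather than assumed.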