This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question you can start a new discussion

How does the NEON access Memory?

Note: This was originally posted on 5th May 2008 at http://forums.arm.com

I have a question about how to get the maximum calculation capability of NEON. In our video processing application, we should access several frame video. Then if the video is HD resolution (1920*1080), the memory size of each frame is more than 6M(1920*1080*3*). So it's impossible to store the total frame in cache. Then we will meet cache miss. I don't know what measure you will take to avoid cache miss.
I will share our experience on this topic when we implement our video processing algorithm on Cell processor and Equator's BSP processor. In Equator's BSP processor, there is DMA measure that can move data between cache(I don't know the details, maybe it's TCM) and memory. So we can set double buffer (for example "ping pong" buffer) in cache to avoid cache miss - when the CPU works on "ping" buffer, we can set the DMA to move data between "pong" buffer and memory, then the time for DMA transfer will be overlapped with the time of CPU's computation, and when the CPU finishes the processing on "ping" buffer and want to process "pong" buffer, it won't meet cache miss.
In Cell processor, the Synergistic Processing Unit (SPU) does not have cache instead of a high speed memory (local store, not more than 256K, include Data, instruction and stack). The local store can be access by a DMA, and this DMA can move data between local store and main memory. Then we also can design a double buffer to move data to one buffer when the SPU is processing the data in other buffer.
My question is that whether there is also the similar DMA in NEON to deal with data movement to avoid cache miss. It's very important for our application, because for video processing, we should access abundant data. And how does the NEON synchronize with the ARM. I have not found the answer in the ARM architecture reference manual. I think that if I can get a simple sample about how to use NEON, I will have some sense about my puzzle.

Thanks!
0