How does the NEON access Memory?
bearfish
over 7 years ago
Note: This was originally posted on 5th May 2008 at
http://forums.arm.com
I have a question about how to get the maximum computational throughput out of NEON. Our video processing application has to access several video frames at once. At HD resolution (1920*1080), each frame takes more than 6 MB (1920*1080*3 bytes), so it is impossible to hold a whole frame in cache and we will hit cache misses. I don't know what measures are available to avoid these cache misses.
I will share our experience from implementing our video processing algorithm on the Cell processor and on Equator's BSP processor. The Equator BSP has a DMA mechanism that can move data between cache (I don't know the details; it may actually be TCM) and memory. That lets us set up a double buffer (a "ping-pong" buffer) in cache to avoid cache misses: while the CPU works on the "ping" buffer, the DMA moves data between the "pong" buffer and memory, so the DMA transfer time overlaps with the CPU's computation. When the CPU finishes with the "ping" buffer and moves on to the "pong" buffer, it does not take a cache miss.
On the Cell processor, the Synergistic Processing Unit (SPU) has no cache; instead it has a high-speed memory (the local store, no more than 256K, holding data, instructions, and stack). The local store can be accessed by a DMA engine that moves data between the local store and main memory. So there too we can use a double buffer, moving data into one buffer while the SPU processes the data in the other.
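To make the ping-pong scheme described above concrete, here is a minimal sketch in C. It is only an illustration under stated assumptions: `dma_fetch`, `process_tile`, `process_frame`, and `TILE_BYTES` are hypothetical names, not part of any Equator, Cell, or ARM API, and the "DMA" is simulated with a blocking `memcpy`. On real hardware the fetch would be asynchronous and a completion wait would be needed before switching buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tile size, assumed small enough to fit in fast local memory. */
#define TILE_BYTES 4096

/* Stand-in for an asynchronous DMA transfer from main memory to a local
 * buffer. Here it is a blocking memcpy; a real engine would return at once
 * and signal completion later. */
static void dma_fetch(uint8_t *local, const uint8_t *main_mem, size_t n)
{
    memcpy(local, main_mem, n);
}

/* Placeholder kernel: sums the bytes of one tile. A real application would
 * run its video processing here. */
static uint64_t process_tile(const uint8_t *tile, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += tile[i];
    return sum;
}

/* Walks a frame tile by tile using two local buffers ("ping" and "pong"). */
uint64_t process_frame(const uint8_t *frame, size_t frame_bytes)
{
    static uint8_t buf[2][TILE_BYTES];
    uint64_t total = 0;
    size_t offset = 0;
    int cur = 0;

    assert(frame != NULL || frame_bytes == 0);
    if (frame_bytes == 0)
        return 0;

    /* Prime the first buffer before the loop starts. */
    size_t cur_len = frame_bytes < TILE_BYTES ? frame_bytes : TILE_BYTES;
    dma_fetch(buf[cur], frame, cur_len);

    while (cur_len > 0) {
        size_t next_off = offset + cur_len;
        size_t remaining = frame_bytes - next_off;
        size_t next_len = remaining < TILE_BYTES ? remaining : TILE_BYTES;

        /* Start filling the other buffer with the next tile... */
        if (next_len > 0)
            dma_fetch(buf[cur ^ 1], frame + next_off, next_len);

        /* ...while the CPU works on the current one. With a real DMA
         * engine these two steps would overlap in time. */
        total += process_tile(buf[cur], cur_len);

        /* A real system would wait for DMA completion here. */
        offset = next_off;
        cur_len = next_len;
        cur ^= 1; /* swap ping and pong */
    }
    return total;
}
```

Calling `process_frame(frame, frame_bytes)` on a frame buffer processes it one tile at a time. The point of the structure is that the fetch for tile k+1 is issued before the computation on tile k, so the transfer can be hidden behind the computation when it is truly asynchronous.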
My question is whether NEON has a similar DMA mechanism for moving data so that cache misses can be avoided. This is very important for our application, because video processing has to access a large amount of data. Also, how does NEON synchronize with the ARM core? I have not found the answer in the ARM Architecture Reference Manual. I think a simple sample showing how to use NEON would help clear up my confusion.
Thanks!