Arm NEON not able to understand the cycles?
wolfrum aurum
over 12 years ago
Note: This was originally posted on 25th March 2013 at http://forums.arm.com
I am working on optimizing an FFT algorithm using NEON on ARM. My target is a BeagleBoard-xM, and I am running the program directly on the board without any operating system. The board is supposed to run at 1 GHz, but I am nowhere near operating at that frequency. Currently I am having difficulty with the basic understanding of NEON. Could anyone please help me with this?
The following are the sample programs I ran.
Loop code:
Loop-unrolled code:
The following are the results I obtained at different frequencies:
The above does not make any sense to me: why are the cycles per instruction different at different frequencies?
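For readers following the question, the quantity being compared can be written out explicitly. This is a minimal sketch in C, assuming the cycle count is derived from an elapsed time and the core clock; the function names `cycles_from_time` and `cycles_per_instruction` and all input values are hypothetical illustrations, not the measurements from this post.

```c
#include <assert.h>
#include <stdint.h>

/* Convert an elapsed wall-clock time into core clock cycles.
 * elapsed_ns: elapsed time in nanoseconds (hypothetical input)
 * freq_mhz:   core clock in MHz (hypothetical input)
 * One MHz is 1e6 cycles per second, so ns * MHz / 1000 gives cycles. */
double cycles_from_time(double elapsed_ns, double freq_mhz)
{
    return elapsed_ns * freq_mhz / 1000.0;
}

/* Average cycles per instruction for a run that executed
 * `instructions` instructions in `cycles` core cycles. */
double cycles_per_instruction(double cycles, uint64_t instructions)
{
    assert(instructions > 0); /* avoid dividing by zero */
    return cycles / (double)instructions;
}
```

By this arithmetic, any part of the run time that is fixed in nanoseconds rather than in core cycles counts for more cycles at a higher clock: `cycles_from_time(100.0, 500.0)` is 50 cycles, while `cycles_from_time(100.0, 1000.0)` is 100 cycles. Whether such a fixed-time component explains the numbers in this thread cannot be determined from the post itself.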
wolfrum aurum
over 12 years ago
Note: This was originally posted on 2nd April 2013 at http://forums.arm.com
Thanks for the input. I have been able to achieve a considerable performance improvement so far, and I am happy with the result. Please just glance through the code at https://code.google.com/p/neon-fft/downloads/detail?name=NE10_cfft.neon1.s&can=2&q= . I don't need you to understand the algorithm. If possible, please give any general suggestions for optimization (like the instruction alignment tip, which was very useful).