Need help in GCC intrinsics for NEON
Kiran Kumar
over 12 years ago
Note: This was originally posted on 4th April 2012 at http://forums.arm.com
Hi All,
Can somebody tell me which GCC and ARM (RVCT) intrinsics generate the NEON assembly statements below?
vld3.16 {d0,d2,d4},[r0]!
vld3.16 {d1,d3,d5},[r0]!
Thanks,
Kiran
Peter Harris
over 12 years ago
Note: This was originally posted on 9th April 2012 at http://forums.arm.com
GCC is the same as RVCT:
uint16x8x3_t vld3q_u16 (const uint16_t *)
Check http://gcc.gnu.org/o...Intrinsics.html for the full listing (loads and stores are towards the bottom).
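As a rough illustration of how that intrinsic is used, here is a minimal sketch that de-interleaves 3-element 16-bit structures into three planar arrays. The helper name `deinterleave3_u16` and the scalar fallback are assumptions added for illustration, not something from the original post. On 32-bit ARM, compilers typically lower a single `vld3q_u16` to a pair of `vld3.16` instructions like the ones in the question, though the exact registers and addressing mode may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: split interleaved data (x0 y0 z0 x1 y1 z1 ...)
 * into three planar arrays. `count` is the number of 3-element
 * structures and must be a multiple of 8 in this sketch. */
void deinterleave3_u16(const uint16_t *src, uint16_t *x, uint16_t *y,
                       uint16_t *z, size_t count)
{
    assert(count % 8 == 0);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    for (size_t i = 0; i < count; i += 8) {
        /* Loads 24 halfwords and de-interleaves them into three
         * 8-lane vectors: val[0] = x, val[1] = y, val[2] = z. */
        uint16x8x3_t v = vld3q_u16(src + 3 * i);
        vst1q_u16(x + i, v.val[0]);
        vst1q_u16(y + i, v.val[1]);
        vst1q_u16(z + i, v.val[2]);
    }
#else
    /* Scalar fallback so the sketch also builds on non-NEON targets. */
    for (size_t i = 0; i < count; ++i) {
        x[i] = src[3 * i + 0];
        y[i] = src[3 * i + 1];
        z[i] = src[3 * i + 2];
    }
#endif
}
```

The pointer advance `src + 3 * i` plays the role of the `[r0]!` writeback in the assembly; whether the compiler actually uses post-increment addressing is up to its code generator.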
One bit of advice: use objdump to check the disassembly GCC emits for NEON intrinsics. Personally I've never been entirely happy with it. It generates an excessive amount of stack traffic to shuffle things between registers, and the intrinsics are so low level that you may as well handle register allocation yourself, write the assembler, and get the output code you actually wanted in the first place.
To be fair, it is improving a lot in the newer GCC releases, but my personal view is that if you have to spell out instructions using intrinsics one instruction at a time, you are basically writing assembler anyway.
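To try the objdump check on something concrete, a small kernel like the sketch below can be compiled and disassembled. The function name `sum3_u16`, the file name `sum3.c`, and the `arm-linux-gnueabihf-` toolchain prefix in the comments are assumptions; substitute whatever your cross toolchain uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Assumed build and inspection commands (toolchain prefix may differ):
 *   arm-linux-gnueabihf-gcc -O2 -march=armv7-a -mfpu=neon -c sum3.c -o sum3.o
 *   arm-linux-gnueabihf-objdump -d sum3.o
 * In the disassembly, look for the vld3.16 loads and for any
 * unexpected stack loads/stores between the NEON instructions. */

/* Hypothetical kernel: dst[i] = x[i] + y[i] + z[i] (mod 2^16) for
 * interleaved input. `count` must be a multiple of 8 in this sketch. */
void sum3_u16(const uint16_t *src, uint16_t *dst, size_t count)
{
    assert(count % 8 == 0);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    for (size_t i = 0; i < count; i += 8) {
        uint16x8x3_t v = vld3q_u16(src + 3 * i);
        uint16x8_t s = vaddq_u16(vaddq_u16(v.val[0], v.val[1]), v.val[2]);
        vst1q_u16(dst + i, s);
    }
#else
    /* Scalar fallback with the same wrapping 16-bit arithmetic. */
    for (size_t i = 0; i < count; ++i) {
        dst[i] = (uint16_t)(src[3 * i] + src[3 * i + 1] + src[3 * i + 2]);
    }
#endif
}
```

If the disassembly shows spills to the stack inside the loop, that is the kind of register shuffling described above, and a hand-written assembler version may be worth comparing.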