Any equivalent NEON instruction to SMULWy?
Jake Lee
over 7 years ago
Note: This was originally posted on 6th July 2013 at http://forums.arm.com
Hi everybody,
I'm currently working on a 7x7 Gaussian blur filter for NEON.
Since anything bigger than 3x3 is hard to handle with a 2D algorithm, I split it into two 1D passes.
The horizontal pass runs first, and every pixel (Y value) is temporarily stored as 16-bit uq8. So far so good. It's super fast, with zero latency and dual issue everywhere possible.
The problem begins when doing it vertically.
I know the wide form isn't available for the multiply instructions. Widening the coefficients to 16 bits for 16-bit * 16-bit multiplications is no problem. With VMULL.u16, however, the result is 32 bits wide, so I have to narrow twice to get the final 8-bit result, and I really don't like that.
I read through the assembly reference several times, but there seems to be no multiply instruction that returns the upper 16 bits as the result. Am I right about this?
Do I really have to accept having to do narrowing twice?
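To make the two narrowing steps concrete, here is a scalar sketch of one output pixel of the vertical pass. It is plain C rather than NEON code, and it rests on assumptions that are not in the post: the intermediate values are treated as unsigned 8.8 fixed point, the coefficients as unsigned Q16 fractions whose sum does not exceed 65535, and the names `vertical_tap_model`, `column` and `coeff_q16` are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define TAPS 7

/* Scalar model of one vertical-pass output pixel (hypothetical helper).
 * Assumes column[] holds unsigned 8.8 fixed-point values and coeff_q16[]
 * holds unsigned Q16 fractions summing to at most 65535, so the 32-bit
 * accumulator cannot overflow. */
static uint8_t vertical_tap_model(const uint16_t column[TAPS],
                                  const uint16_t coeff_q16[TAPS])
{
    uint32_t acc = 0;

    /* 16-bit x 16-bit -> 32-bit multiply-accumulate, the role played by
     * the widening multiply (VMULL.u16) in the post. */
    for (size_t i = 0; i < TAPS; ++i)
        acc += (uint32_t)column[i] * coeff_q16[i];

    /* First narrowing: keep the upper 16 bits (32 -> 16). */
    uint16_t q8_8 = (uint16_t)(acc >> 16);

    /* Second narrowing: round and drop the 8 fraction bits (16 -> 8). */
    uint32_t rounded = ((uint32_t)q8_8 + 0x80u) >> 8;

    return rounded > 255u ? 255u : (uint8_t)rounded;
}
```

In this model the two shifts could be merged into a single shift by 24, but the NEON narrowing shifts only halve the element width, which is why the vector version ends up narrowing in two steps.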
I badly need something like SMULWy....
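For readers who don't know SMULWy, this is roughly what it does, written as a scalar C sketch of the bottom-halfword variant (SMULWB): a signed 32-bit by 16-bit multiply that keeps bits [47:16] of the 48-bit product. The function name `smulwb_model` is hypothetical, and the sketch assumes the compiler uses two's-complement conversion and arithmetic right shift for signed values, as common compilers do.

```c
#include <stdint.h>

/* Scalar model of SMULWB (hypothetical helper name).
 * Multiplies a signed 32-bit value by the signed bottom halfword of rm
 * and returns bits [47:16] of the 48-bit product.
 * Assumes two's-complement narrowing and an arithmetic right shift on
 * negative values, both of which are implementation-defined in C. */
static int32_t smulwb_model(int32_t rn, int32_t rm)
{
    int16_t bottom = (int16_t)(uint16_t)rm;       /* bottom halfword, sign-extended */
    int64_t product = (int64_t)rn * (int64_t)bottom;

    return (int32_t)(product >> 16);              /* drop the low 16 bits */
}
```

The NEON counterpart being asked for would do the same on unsigned 16-bit lanes: multiply and keep only the upper half, without the intermediate 32-bit result.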
VQDMULH, which returns the upper half of the result, won't do the trick, since it works only with signed values and doubles the result, if I understood correctly.
I'm really curious:
What is VQDMULH good for?
I can hardly imagine any case where that doubling would be useful. Can someone enlighten me?
Thanks in advance
Simon Craske
over 7 years ago
Note: This was originally posted on 7th July 2013 at http://forums.arm.com
VQDMULH performs the "shift-by-Q" required to compute a fixed-point multiply.
Using "0.8 * 0.8 = 0.64" in Q15 format as an example:
The Q15 register value is given by 0.8 * 2^15 = 26214.4, truncated to 26214
26214 in hexadecimal = 0x6666
0x6666 * 0x6666 = 0x28F570A4
0x28F570A4 * 2 = 0x51EAE148
Top half of 0x51EAE148 = 0x51EA
0x51EA in decimal = 20970
The interpretation of this Q15 register value is 20970 / 2^15 ≈ 0.64 (0.63995 to five places)
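The steps above can be checked with a small scalar model of one 16-bit lane of VQDMULH. This is a sketch in plain C, not NEON code; the name `vqdmulh_s16_model` is made up for illustration, and it assumes an arithmetic right shift for negative values, which is what common compilers provide.

```c
#include <stdint.h>

/* Scalar model of one signed 16-bit lane of VQDMULH (hypothetical name).
 * Doubles the 32-bit product, saturates it, and returns the top half.
 * Assumes >> on a negative int64_t is an arithmetic shift. */
static int16_t vqdmulh_s16_model(int16_t a, int16_t b)
{
    int64_t doubled = 2 * (int64_t)a * (int64_t)b;

    /* The only case that overflows 32 bits is (-32768) * (-32768) * 2. */
    if (doubled > INT32_MAX)
        doubled = INT32_MAX;

    return (int16_t)(doubled >> 16);
}
```

Calling `vqdmulh_s16_model(0x6666, 0x6666)` returns 0x51EA, matching the worked example. The doubling is what keeps the result in Q15: a Q15 * Q15 product is Q30, and one extra left shift turns it into Q31 so that the top 16 bits are again a Q15 value.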
hth
s.