NEON vdiv.f32 syntax

Note: This was originally posted on 17th April 2012 at http://forums.arm.com

I am (re)coding a 3D math library with inline NEON assembly for iOS using the Apple LLVM compiler 3.1.

I get an error message on the following instruction:

    "vdiv.f32 q0, q1, q2 \n\t"

VFP single or double precision register expected -- `vdiv.f32 q0,q1,q2'

According to the 'Assembler Reference' page 4-76 you should specify a single precision register. The following code works:

    "vdiv.f32 s0, s4, s8 \n\t"
    "vdiv.f32 s1, s5, s9 \n\t"
    "vdiv.f32 s2, s6, s10 \n\t"

I am confused because now the divide is not computed in parallel, which was the reason to use inline assembly.

Also the following instructions work as expected:

    // component wise add
    "vadd.f32 q0, q1, q2 \n\t"

    // component wise subtract
    "vsub.f32 q0, q1, q2 \n\t"

    // component wise multiply
    "vmul.f32 q0, q1, q2 \n\t"
Why do I get an error message on the vdiv and not on the vadd, vsub and vmul? Is this a compiler error?

Carl van Heezik over 12 years ago

Note: This was originally posted on 17th April 2012 at http://forums.arm.com

Do you mean that these instructions don't exist:
VDIV.f32 d0, d1, d2 // NEON 2 float operation
VDIV.f32 q0, q1, q2 // NEON 4 float operation

And these instructions exist:

VADD.f32 d0, d1, d2 // NEON 2 float operation
VADD.f32 q0, q1, q2 // NEON 4 float operation

Why is it not mentioned in the documentation? Does the divider use too much space on chip?

So you have to trade speed for accuracy?
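
For context: on ARMv7, VDIV is a VFP instruction that works on scalar S or D registers, and the NEON (Advanced SIMD) instruction set has no vector divide. The usual substitute is a reciprocal estimate (VRECPE) refined with Newton-Raphson steps (VRECPS), followed by a multiply. Below is a minimal sketch of that idea using the intrinsics from arm_neon.h rather than inline assembly. The function name div4_approx, the number of refinement steps, and the portable fallback (a bit-trick estimate that only stands in for VRECPE on non-ARM builds) are my own assumptions for illustration, not something from the documentation or the original post. It assumes finite, nonzero, normal denominators, and the result is close to but not guaranteed bit-identical to a true divide.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: out[i] ~= num[i] / den[i] for four float lanes.
 * Assumes every den[i] is finite, nonzero and not denormal. */
void div4_approx(const float num[4], const float den[4], float out[4])
{
    assert(num != NULL && den != NULL && out != NULL);

#if HAVE_NEON
    float32x4_t n = vld1q_f32(num);
    float32x4_t d = vld1q_f32(den);

    /* VRECPE: rough estimate of 1/d (only a few bits are accurate). */
    float32x4_t r = vrecpeq_f32(d);

    /* VRECPS computes (2 - d*r); multiplying r by it is one
     * Newton-Raphson step. Two steps is a common choice. */
    r = vmulq_f32(r, vrecpsq_f32(d, r));
    r = vmulq_f32(r, vrecpsq_f32(d, r));

    /* num / den ~= num * (1/den) */
    vst1q_f32(out, vmulq_f32(n, r));
#else
    /* Portable stand-in so the sketch also builds without NEON.
     * The integer subtraction is a crude reciprocal estimate and is
     * NOT what VRECPE does internally; the refinement step is the same. */
    for (int i = 0; i < 4; ++i) {
        uint32_t bits;
        float r;

        memcpy(&bits, &den[i], sizeof bits);
        bits = 0x7EF311C7u - bits;
        memcpy(&r, &bits, sizeof r);

        /* The estimate is coarser than VRECPE, so use three steps. */
        for (int step = 0; step < 3; ++step) {
            r = r * (2.0f - den[i] * r);
        }
        out[i] = num[i] * r;
    }
#endif
}
```

Calling div4_approx with num = {1, 6, -9, 10} and den = {4, 3, 2, -8} should give values very close to {0.25, 2, -4.5, -1.25}. So yes, this is a speed versus accuracy trade: each extra refinement step costs two more instructions and roughly doubles the number of correct bits.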