Reading 8 u8s into 8 u16 lanes in NEON Q register

Note: This was originally posted on 2nd June 2011 at http://forums.arm.com

I'm totally new to hand-coding SIMD and to the ARM ISA in general, and am having trouble understanding the documentation for the vector load instructions. I have an array of 8 unsigned 8-bit numbers on the heap, and I would like to load them into a Q register as 16-bit unsigned integers (zero-extended), filling all 128 bits. The 16-bit width is needed because of the multiplies that I perform after the load.

I believe I have everything else figured out, but even after reading the docs for the VLDn instruction multiple times, I was unable to determine if this is possible with a single instruction.  I do not care what order the u8s are loaded into the register, provided it's deterministic.
  • Note: This was originally posted on 2nd June 2011 at http://forums.arm.com

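    If you load the data into a 64-bit D register and the subsequent multiply is VMULL, the multiply does the widening for you: VMULL.U8 takes two D registers of 8-bit lanes and writes eight 16-bit products to a Q register, so no separate widening step is needed.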

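    If you can't use VMULL then you can load one 64-bit register with your data and another with a zero constant, and then use VZIP.8 to interleave them. The two D registers need to be the two halves of the same Q register (d0 and d1 for q0, for example) so that the interleaved result reads as eight zero-extended 16-bit lanes. Note that VZIP overwrites both of its operands, so the zeros are lost.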

    If you need to do this repeatedly, use VADDL.U8 to add zero to your 64-bit value instead. This widens into a Q register but does not clobber the vector of zeros, so you can reuse it for subsequent operations.
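
As a rough illustration of the VADDL and VMULL suggestions above, here is a minimal sketch using the NEON intrinsics from `arm_neon.h` rather than hand-written assembly. The function names `widen_u8x8` and `mul_widen_u8x8` are hypothetical, chosen only for this example, and the scalar fallback is an addition so the sketch also compiles on targets without NEON. Which instructions the compiler finally emits is up to the compiler.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: load 8 bytes and zero-extend them into 8 u16 values. */
void widen_u8x8(const uint8_t *src, uint16_t dst[8])
{
#if HAVE_NEON
    uint8x8_t  data  = vld1_u8(src);          /* 64-bit load of 8 x u8      */
    uint8x8_t  zeros = vdup_n_u8(0);          /* reusable vector of zeros   */
    uint16x8_t wide  = vaddl_u8(data, zeros); /* VADDL.U8: widening add     */
    /* vmovl_u8(data) is the dedicated widening move and needs no zeros.    */
    vst1q_u16(dst, wide);
#else
    /* Scalar fallback (assumption: only here so the sketch runs anywhere). */
    for (size_t i = 0; i < 8; ++i)
        dst[i] = (uint16_t)src[i];
#endif
}

/* Hypothetical helper: 8 x (u8 * u8) -> 8 x u16, no separate widening step. */
void mul_widen_u8x8(const uint8_t *a, const uint8_t *b, uint16_t dst[8])
{
#if HAVE_NEON
    uint16x8_t prod = vmull_u8(vld1_u8(a), vld1_u8(b)); /* VMULL.U8 */
    vst1q_u16(dst, prod);
#else
    for (size_t i = 0; i < 8; ++i)
        dst[i] = (uint16_t)(a[i] * b[i]);     /* 255 * 255 fits in 16 bits  */
#endif
}
```

If the multiply really is the very next operation, `mul_widen_u8x8` shows why the explicit widening can be skipped altogether: the product of two 8-bit lanes always fits in a 16-bit lane.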