Fast duplicate lane

Note: This was originally posted on 17th April 2013 at http://forums.arm.com

Hi,
I have a small problem.

As input I have a Dn register holding 8 bytes:
[a, b, c, d, e, f, g, h]

I'd like to end up with two Dn registers containing
[a, a, b, b, c, c, d, d]
and
[e, e, f, f, g, g, h, h]

The goal is to do this with a minimum number of NEON registers.
For the moment, the best way I have found is something like:


vmovl.u8              Qn, Dn        @ widen each byte to a halfword
vmul.u16              Qn, Qn, Qx    @ Qx holds 257 (0x0101) in each of its 8 halfword lanes
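
To see why the multiply duplicates each byte, here is a rough scalar model in C. It is only a sketch of the lane arithmetic, not NEON code: the function name `dup_lanes_mul` and the `lo`/`hi` output arrays are made up for illustration, and they stand in for the two D halves of Qn.

```c
#include <stdint.h>

/* Scalar model (hypothetical helper, not a NEON intrinsic) of:
 *   vmovl.u8 Qn, Dn      -> zero-extend each byte to 16 bits
 *   vmul.u16 Qn, Qn, Qx  -> multiply each halfword by 257 (0x0101)
 * lo and hi model the low and high D halves of Qn. */
void dup_lanes_mul(const uint8_t in[8], uint8_t lo[8], uint8_t hi[8])
{
    for (int i = 0; i < 8; i++) {
        uint16_t wide = in[i];                    /* widening step */
        uint16_t prod = (uint16_t)(wide * 257u);  /* b * 257 == (b << 8) | b */
        uint8_t *dst = (i < 4) ? &lo[2 * i] : &hi[2 * (i - 4)];
        dst[0] = (uint8_t)(prod & 0xFFu);         /* low byte of the halfword lane */
        dst[1] = (uint8_t)(prod >> 8);            /* high byte of the halfword lane */
    }
}
```

Since a byte b is at most 255, b * 257 is at most 65535, so the product always fits in the 16-bit lane and both bytes of the lane end up equal to b.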


I'm looking for a solution that does not use an extra register!

Do you have any ideas?
Thanks
  • Note: This was originally posted on 22nd April 2013 at http://forums.arm.com

    OK, thanks to everybody.

    I'll try the newly proposed solutions, but it seems that vzip would be the best one.
    It was very interesting to see so many different solutions to this simple problem!

    @RKSimon: thanks for VSLI, I had never tried this instruction ;)
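
    For readers who have not seen the vzip answers, a plausible reading is: copy the input into a second D register, then zip the two registers with each other (something like `vmov Dm, Dd` followed by `vzip.8 Dd, Dm`). The exact sequence proposed in the thread is not reproduced here, so treat this as an assumption. Below is a scalar C model of that idea; `vzip8_model` and `dup_lanes_zip` are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Scalar model of VZIP.8 Dd, Dm: interleave the bytes of d and m.
 * The first 8 interleaved bytes go back into d, the last 8 into m. */
void vzip8_model(uint8_t d[8], uint8_t m[8])
{
    uint8_t out[16];
    for (int i = 0; i < 8; i++) {
        out[2 * i]     = d[i];
        out[2 * i + 1] = m[i];
    }
    for (int i = 0; i < 8; i++) {
        d[i] = out[i];
        m[i] = out[8 + i];
    }
}

/* Assumed usage: duplicate the input register, then zip it with its copy. */
void dup_lanes_zip(const uint8_t in[8], uint8_t lo[8], uint8_t hi[8])
{
    for (int i = 0; i < 8; i++) {
        lo[i] = in[i];   /* models Dd holding the input */
        hi[i] = in[i];   /* models vmov Dm, Dd */
    }
    vzip8_model(lo, hi); /* models vzip.8 Dd, Dm */
}
```

    In this model only the two output registers are touched, and no constant register such as Qx is needed.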