Is it possible to reinterpret-cast a uint32x4_t into a uint8x8x2_t using intrinsics?
Sincerely
DorJo,
As far as I am aware, there are no reinterpret intrinsics that yield an NxMx2 vector type. The following code does what I believe you are trying to achieve by going via a pair of uint8x8_t values; with ARMCC and ARMCLANG it produces no intermediate moves, i.e. the output is just "VLD1.32 {d16,d17},[r1]" followed by "VST2.8 {d16,d17},[r0]".
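#include <arm_neon.h>

void func(uint8_t *dst, const uint32_t *src)
{
    uint32x4_t in;
    uint8x8x2_t out;

    in = vld1q_u32(src);
    /* View the 128-bit register as bytes, then split it into its two 64-bit halves. */
    out.val[0] = vget_low_u8(vreinterpretq_u8_u32(in));
    out.val[1] = vget_high_u8(vreinterpretq_u8_u32(in));
    /* VST2 stores the two halves interleaved: val[0][0], val[1][0], val[0][1], ... */
    vst2_u8(dst, out);
}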
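To make the resulting byte order explicit, here is a minimal scalar sketch of what func() writes to dst. It is only a model for checking results, not the NEON code itself: the name func_reference is hypothetical, and it assumes a little-endian target, where the byte lanes after the reinterpret follow memory order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of func(); not part of the NEON code above. */
void func_reference(uint8_t *dst, const uint32_t *src)
{
    uint8_t bytes[16];
    size_t i;

    assert(dst != NULL && src != NULL);

    /* Take the 16 source bytes in memory order (assumes little-endian,
       where that matches the byte lanes after the reinterpret). */
    memcpy(bytes, src, sizeof bytes);

    /* Model the 2-way interleaving store: low-half lane, then high-half lane. */
    for (i = 0; i < 8; ++i) {
        dst[2 * i]     = bytes[i];      /* lane i of out.val[0] (low half)  */
        dst[2 * i + 1] = bytes[8 + i];  /* lane i of out.val[1] (high half) */
    }
}
```

For source bytes 0 to 15 this model writes 0, 8, 1, 9, 2, 10 and so on up to 7, 15. If the bytes should instead stay in their original order, a plain vst1q_u8 of the reinterpreted vector is probably the store to use rather than vst2_u8.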
Best regards
Simon.