Vector reinterpret cast ARM Neon

I have a `uint32x4_t` ARM NEON vector register. I want to "shuffle" its four `uint32_t` lanes with
[vtbx2](infocenter.arm.com/.../index.jsp "Extended table look up intrinsics") and [vext](infocenter.arm.com/.../index.jsp "Vector reinterpret cast operations").

The table lookup intrinsics expect `uint8x8_t` operands. This seems to be possible with a [cast](infocenter.arm.com/.../index.jsp "vector reinterpret cast"), especially because the documentation states that the

> "[...] conversions do not change the bit pattern represented by the vector."

I tried it with the following code:

#include <iostream>
#include <arm_neon.h>
#include <bitset>

int main() {
    uint32_t* data = new uint32_t[4];
    uint32_t* result = new uint32_t[4];
    // 00 00 0A 0A
    data[0] = 2570;
    // 00 0A 00 0A
    data[1] = 655370;
    // 0A 0A 0A 0A
    data[2] = 168430090;
    // 00 00 00 0A
    data[3] = 10;
    // load data
    uint32x4_t dataVec = vld1q_u32(data);
    // cast to uint8
    uint8x16_t dataVecByteTmp = vreinterpretq_u8_u32(dataVec);
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

One can compile it with:

> g++ -march=native -mfpu=neon -std=c++14 main.cpp

The output looks like this:

Orig Data:
00000000000000000000101000001010
00000000000010100000000000001010
00001010000010100000101000001010
00000000000000000000000000001010
unsigned output
10
10
0
0
10
0
10
0
10
10
10
10
10
0
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

As one can see, the result is not what I expected. Does anyone know whether I did something wrong, or is this just a bug?

Sincerely
