Port x86_64 Intrinsics to ARM64 equivalent

I am new to ARM64 assembly and intrinsics. I have a small routine that uses SSE4.1 x86_64 intrinsics for a vector dot product, and I am trying to replace them with ARM64 intrinsics as closely as possible. I believe that on ARM64 I will be using single precision rather than double precision, so the results will differ slightly, but I would like to get as close as possible. I do have access to Arm NEON. Either intrinsics or inline asm would work. I am currently stuck. Thanks.

ARM

    float32x4_t a, b;

x86_64

    __m128 a, b;

ARM

    ????

x86_64

    result = _mm_dp_pd(a, b, mask);
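One possible starting point, offered as a sketch rather than a definitive port: `_mm_dp_pd` works on two doubles (`__m128d`), and AArch64 NEON has a matching `float64x2_t` type, so dropping to single precision may not be necessary. The helper names below (`dp_pd_scalar`, `dp_pd_neon`, `dp_pd`) are hypothetical and not part of any library. The sketch assumes the usual `_mm_dp_pd` mask layout: bits 4-5 choose which products enter the sum, and bits 0-1 choose which output lanes receive it. The scalar version spells out those semantics and doubles as a fallback on non-ARM builds.

```c
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Scalar reference for the assumed semantics of _mm_dp_pd(a, b, imm8):
 * bits 4-5 of imm8 select which lane products are summed,
 * bits 0-1 select which output lanes get the sum (others become 0.0). */
static inline void dp_pd_scalar(const double a[2], const double b[2],
                                int imm8, double out[2])
{
    double p0 = (imm8 & 0x10) ? a[0] * b[0] : 0.0;
    double p1 = (imm8 & 0x20) ? a[1] * b[1] : 0.0;
    double sum = p0 + p1;
    out[0] = (imm8 & 0x01) ? sum : 0.0;
    out[1] = (imm8 & 0x02) ? sum : 0.0;
}

#if defined(__aarch64__)
/* AArch64 NEON version of the same operation (hypothetical helper). */
static inline float64x2_t dp_pd_neon(float64x2_t a, float64x2_t b, int imm8)
{
    /* All-ones lanes keep a value, all-zero lanes force it to +0.0. */
    const uint64_t in_bits[2]  = { (imm8 & 0x10) ? UINT64_MAX : 0,
                                   (imm8 & 0x20) ? UINT64_MAX : 0 };
    const uint64_t out_bits[2] = { (imm8 & 0x01) ? UINT64_MAX : 0,
                                   (imm8 & 0x02) ? UINT64_MAX : 0 };

    /* Lane-wise products, with unselected products zeroed. */
    float64x2_t prod = vmulq_f64(a, b);
    prod = vreinterpretq_f64_u64(
        vandq_u64(vreinterpretq_u64_f64(prod), vld1q_u64(in_bits)));

    /* Horizontal add across both lanes, broadcast to a full vector. */
    float64x2_t sum = vdupq_n_f64(vaddvq_f64(prod));

    /* Keep the sum only in the requested output lanes. */
    return vreinterpretq_f64_u64(
        vandq_u64(vreinterpretq_u64_f64(sum), vld1q_u64(out_bits)));
}
#endif

/* Portable entry point: NEON on AArch64, scalar reference elsewhere. */
static inline void dp_pd(const double a[2], const double b[2],
                         int imm8, double out[2])
{
#if defined(__aarch64__)
    vst1q_f64(out, dp_pd_neon(vld1q_f64(a), vld1q_f64(b), imm8));
#else
    dp_pd_scalar(a, b, imm8, out);
#endif
}
```

For example, with `a = {1.0, 2.0}` and `b = {3.0, 4.0}`, calling `dp_pd(a, b, 0x31, out)` leaves `11.0` in `out[0]` and `0.0` in `out[1]`. If the x86 routine really uses `__m128` floats (that is, `_mm_dp_ps`), the same pattern should carry over with `float32x4_t`, `vmulq_f32` and `vaddvq_f32`, though the float horizontal add may sum the lanes in a different order than SSE4.1 does, so the last bits can differ.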