
Is there an intrinsic to store 3 float values?

I have the following code in assembler:

    vst1.32            {d10}, [%[pOutVertex2]]
    fsts               s22, [%[pOutVertex2], #8]

This stores s20, s21 (together d10) and s22 into pOutVertex, which is an array of 3 floats. Is there an intrinsic to do this? I can only find vst1q_f32, but that writes four floats, so it would overwrite the 4th value in pOutVertex.
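For reference, one way the two-instruction store above might be written with intrinsics is a 64-bit store of the low half followed by a single-lane store. This is only a sketch: `store3_f32` and `store3_from_array` are hypothetical names, and the scalar branch exists only so the snippet also builds on targets without NEON.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* Write lanes 0..2 of v to dst[0..2]; dst[3] is never touched. */
static inline void store3_f32(float *dst, float32x4_t v)
{
    vst1_f32(dst, vget_low_f32(v));   /* lanes 0 and 1, like vst1.32 {d10} */
    vst1q_lane_f32(dst + 2, v, 2);    /* lane 2 only, like the s22 store   */
}
#endif

/* Hypothetical helper: copy the first three of four floats from src to dst. */
static inline void store3_from_array(float *dst, const float src[4])
{
    assert(dst != NULL && src != NULL);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    store3_f32(dst, vld1q_f32(src));
#else
    /* Scalar fallback for non-NEON builds (an assumption, not part of the question). */
    dst[0] = src[0];
    dst[1] = src[1];
    dst[2] = src[2];
#endif
}
```

Whether the compiler emits exactly the same two instructions as the hand-written assembly depends on the toolchain and register allocation.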

  • Sizes that are not a power of two or a full register are a pain in NEON, but if you are willing to sacrifice a little storage space, the obvious data layout change is to allocate vec4 inputs and outputs. The final increment then becomes a vec4 load of pOutVertex, a vec4 addition, and a vec4 store that writes the incremented value back to pOutVertex.
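A minimal sketch of that padded layout, assuming each vertex is stored as four floats with the last one unused. The `Vec4` type and `vec4_accumulate` are hypothetical names for illustration, and the scalar branch is only there so the code builds without NEON.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical padded vertex: x, y, z plus one unused float of padding. */
typedef struct {
    float v[4];
} Vec4;

/* out += delta, done as one 4-wide load, add and store on NEON targets. */
static inline void vec4_accumulate(Vec4 *out, const Vec4 *delta)
{
    assert(out != NULL && delta != NULL);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    float32x4_t acc = vld1q_f32(out->v);          /* vec4 load  */
    acc = vaddq_f32(acc, vld1q_f32(delta->v));    /* vec4 add   */
    vst1q_f32(out->v, acc);                       /* vec4 store */
#else
    /* Scalar fallback for non-NEON builds. */
    for (size_t i = 0; i < 4; ++i) {
        out->v[i] += delta->v[i];
    }
#endif
}
```

The padding lane is written on every store, so it must belong to the same vertex; the cost is one extra float per vertex in exchange for full-register loads and stores.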
