Hi, I have an issue with an in-place vext.32 instruction. I posted it on the Cortex-A forum. Does anyone have a tip or workaround, as it is too slow on A7, A8, A9, etc.? Thanks
vetx-32-q1-q1-q1-3-slow
Hi,
Which compiler are you using?
Thanks,
Paul.
Hi Paul, GCC 8.3, but I benchmarked on many Arm cores (A7, A8, A9, A53, A72) and the results are consistent (the numbers differ between cores) and seem logical. However, I am not happy to spend so many cycles (5 on A7) on a single instruction. I would be grateful if you could ask the team for a workaround. Cheers. bruno
This is the Arm Compiler forum. Your query would be better answered in the GCC toolchain forum: https://community.arm.com/developer/tools-software/oss-platforms/f/gnu-toolchain-forum
Thanks
Peterson
Hi br-dev,
If you're talking about the latency of the instruction itself, then there isn't really an alternative to it.
You could in principle do it with two shifts (left and right) and an OR, but that will undoubtedly be more expensive.
Depending on the actual operation you're doing, you may be able to use a different sequence, but if you're only talking about vext then I don't believe there is.
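To make the "two shifts and an OR" idea concrete, here is a minimal scalar sketch in portable C. It is only a model and not NEON code: the type `q128_model` and the helpers `shift_up_1`, `shift_down_3`, `or128` and `rotate_by_shifts` are hypothetical names invented for illustration, and the whole-register shifts are assumed operations rather than specific instructions. It shows that an in-place `vext.32 q1, q1, q1, #3` amounts to rotating the four 32-bit lanes, and that the shift/OR form produces the same lanes with three operations instead of one.

```c
#include <stdint.h>

/* Scalar model of a 128-bit Q register: four 32-bit lanes, lane 0 is the lowest. */
typedef struct { uint32_t lane[4]; } q128_model;

/* Model of vext.32 qd, qn, qm, #3: the top lane of qn followed by lanes 0..2 of qm. */
static q128_model vext32_3(q128_model n, q128_model m) {
    q128_model d = {{ n.lane[3], m.lane[0], m.lane[1], m.lane[2] }};
    return d;
}

/* Hypothetical whole-register shift towards the higher lanes by one lane (32 bits), zero fill. */
static q128_model shift_up_1(q128_model a) {
    q128_model d = {{ 0, a.lane[0], a.lane[1], a.lane[2] }};
    return d;
}

/* Hypothetical whole-register shift towards the lower lanes by three lanes (96 bits), zero fill. */
static q128_model shift_down_3(q128_model a) {
    q128_model d = {{ a.lane[3], 0, 0, 0 }};
    return d;
}

/* Lane-wise bitwise OR of two modelled registers. */
static q128_model or128(q128_model a, q128_model b) {
    q128_model d;
    for (int i = 0; i < 4; ++i) {
        d.lane[i] = a.lane[i] | b.lane[i];
    }
    return d;
}

/* The in-place case (qn == qm) rebuilt from two shifts and an OR. */
static q128_model rotate_by_shifts(q128_model a) {
    return or128(shift_down_3(a), shift_up_1(a));
}
```

In this model, calling `vext32_3(a, a)` and `rotate_by_shifts(a)` on the same input gives identical lanes, which is why the alternative buys nothing: it needs three dependent operations where vext needs one.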
Regards,
Tamar
Thanks to all for the answers: Paul, Peterson and Tamar. Your answers helped me sort it out, as per Tamar's comment. I understand the latency of the instruction in the in-place case; unfortunately, I also do not see another way to do this operation. Using a different sequence also leads to the same point in my use case (all roads lead to Rome). Closing the topic then. Cheers.