Neon instruction timing/latency

Note: This was originally posted on 7th July 2010 at http://forums.arm.com

Hello!

I am having trouble deciphering the tables in the Cortex-A8 Technical Reference Manual that contain the NEON Advanced SIMD instruction timings. Nothing explains what the different N values mean. I suspect that they refer to different stages of the pipeline, but since I have not yet found any information on the NEON pipeline, they don't tell me anything.

What I would really like to see is the kind of information that was available in the ARM1136 reference manual: which registers are needed as early or late registers, the result latency, and so on. It is probably possible to derive something similar from the supplied N values, but I haven't managed to yet.

There is clearly some latency in the NEON instructions, since I can gain quite a bit of performance by rearranging them. I would like to do this in a more scientific manner, where I can determine beforehand whether rearranging would gain anything. Right now I simply try to place instructions that depend on each other as far apart as possible.
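
To illustrate what I mean by spacing dependent instructions apart, here is a minimal sketch in plain C rather than NEON assembly. The function names are made up for this post, and the code says nothing about the actual Cortex-A8 timings. The first version forms one long dependency chain; the second splits it into four independent chains so that consecutive additions do not wait on each other.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: every addition depends on the result of the
 * previous one, so any result latency is fully exposed. */
uint32_t sum_single_chain(const uint32_t *v, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

/* Same result, but four independent accumulators. Each addition only
 * depends on one issued four additions earlier, which is the kind of
 * spacing I currently arrange by hand. */
uint32_t sum_four_chains(const uint32_t *v, size_t n)
{
    uint32_t a0 = 0, a1 = 0, a2 = 0, a3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        a0 += v[i];
        a1 += v[i + 1];
        a2 += v[i + 2];
        a3 += v[i + 3];
    }
    /* Handle the leftover elements when n is not a multiple of 4. */
    for (; i < n; i++)
        a0 += v[i];

    return a0 + a1 + a2 + a3;
}
```

Both functions return the same value because unsigned addition wraps and is associative. Whether four chains is too few or too many for a given NEON instruction is exactly what I cannot work out from the tables.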

Best regards,
//Leo