I'm looking for a tool that lets me iterate faster on my ARM NEON optimizations entirely in software, i.e. without using any hardware or dev boards. I came across Arm Development Studio and its Fixed Virtual Platforms (FVPs). I am not very particular about cycle-count accuracy compared to real hardware. As long as I get consistent cycle counts across multiple runs of the simulation, that will be sufficient for me to optimize my code.
It would be good if I could select a Cortex-A series processor (say, the A53 for now) and a memory model for the DRAM to go with it.
PS - there is also a PMU_AArch64 example provided with Arm Development Studio which you may find useful.
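For reference, here is a minimal sketch of timing a kernel with the AArch64 PMU cycle counter (PMCCNTR_EL0). It is not taken from the PMU_AArch64 example. It assumes the counter has already been enabled (PMCR_EL0.E and bit 31 of PMCNTENSET_EL0) and that the code runs at an exception level allowed to read it. `USE_PMU_CCNT`, `read_cycles`, `min_cycles` and `sum_u8` are hypothetical names used only for illustration. Without `USE_PMU_CCNT` the code falls back to `clock()`, so it also builds on a host machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Read a tick counter. With USE_PMU_CCNT (hypothetical macro) on AArch64 this
 * reads PMCCNTR_EL0; it assumes the counter is already enabled and readable
 * at the current exception level. Otherwise it falls back to clock(). */
static inline uint64_t read_cycles(void)
{
#if defined(__aarch64__) && defined(USE_PMU_CCNT)
    uint64_t v;
    /* isb stops earlier instructions from drifting past the counter read */
    __asm__ volatile("isb\n\tmrs %0, pmccntr_el0" : "=r"(v) : : "memory");
    return v;
#else
    return (uint64_t)clock();
#endif
}

/* Hypothetical stand-in for the NEON kernel under test. */
static uint32_t sum_u8(const uint8_t *p, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += p[i];
    return acc;
}

typedef uint32_t (*kernel_fn)(const uint8_t *, size_t);

/* Run the kernel several times and keep the smallest tick delta, which is
 * usually the most repeatable figure to compare between code versions. */
static uint64_t min_cycles(kernel_fn fn, const uint8_t *buf, size_t n, int runs)
{
    assert(runs > 0);
    uint64_t best = UINT64_MAX;
    volatile uint32_t sink = 0; /* keeps the call from being optimized away */
    for (int i = 0; i < runs; i++) {
        uint64_t t0 = read_cycles();
        sink = fn(buf, n);
        uint64_t t1 = read_cycles();
        if (t1 - t0 < best)
            best = t1 - t0;
    }
    (void)sink;
    return best;
}
```

Calling `min_cycles(sum_u8, buf, len, 10)` before and after a change gives two numbers to compare. Taking the minimum over several runs is a design choice that filters out cold-cache and warm-up effects from the first iteration.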
Thanks Ronan. This is really useful.