Hello everyone,
I developed a bare-metal system for the Cortex-A15 Versatile Express board. I have been using Fast Models to run and debug it, and now I want to run some performance tests to measure latencies by counting elapsed cycles with the PMU. I noticed that ARM Fast Models use a fixed CPI of 1 for all instructions and that timing annotation is not supported for the FVP_VE models. Is this true?
Since I cannot test the system on real hardware, are there other ways to accurately count elapsed cycles?
Thanks a lot in advance.
Any kind of help would be much appreciated.
Giò
Hello,
It is possible to change the CPI values for Cortex-A15 when running on Fast Models, but that still may not give you enough detail. The other possibility is ARM Cycle Models. Cycle Models instrument all PMU events and come with tools that capture all of the performance data with no programming on your part. Whether Cycle Models are applicable depends mostly on how many cycles you need to run: Cycle Model performance is in the thousands of instructions/sec, versus millions of instructions/sec for Fast Models. Since you mention bare-metal software, it may be a good fit. https://developer.arm.com/products/system-design/cycle-models
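To get a feel for that trade-off, here is a rough sketch of the arithmetic in C. The function name sim_seconds is made up for illustration, and any rates you pass in are placeholders for the "thousands" and "millions" orders of magnitude mentioned above, not measured figures for any particular model.

```c
#include <stdint.h>

/* Hypothetical helper: estimated wall-clock seconds needed to simulate
 * `instructions` at a model throughput of `instr_per_sec`.
 * Returns -1.0 if the rate is not positive. */
double sim_seconds(uint64_t instructions, double instr_per_sec)
{
    if (instr_per_sec <= 0.0)
        return -1.0;
    return (double)instructions / instr_per_sec;
}
```

For example, with placeholder rates of 1,000 and 1,000,000 instructions/sec, a test that executes one million instructions would take roughly 1,000 seconds on the slower model and about 1 second on the faster one.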
You can review an article on the subject to see if this is a possibility. https://community.arm.com/soc/b/blog/posts/system-performance-analysis-and-the-arm-performance-monitor-unit-pmu
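For reference, here is a minimal sketch of counting elapsed cycles with the ARMv7-A PMU cycle counter (PMCCNTR) from bare-metal C. It assumes a GCC-style toolchain with inline assembly, code running in a privileged mode, and an ARMv7-A target; the function names are my own, not from any ARM library. The register accesses are guarded so that the wrap-safe arithmetic still compiles on a host machine. Keep in mind that on a Fast Model the value you read follows the model's configured CPI, as you noticed, rather than real pipeline timing.

```c
#include <stdint.h>

#if defined(__arm__)
/* Enable the PMU and reset/start the cycle counter (ARMv7-A, privileged). */
static inline void pmu_start_cycle_counter(void)
{
    uint32_t pmcr;
    __asm__ volatile("mrc p15, 0, %0, c9, c12, 0" : "=r"(pmcr)); /* read PMCR */
    pmcr |= (1u << 0) | (1u << 2); /* E: enable counters, C: reset cycle counter */
    pmcr &= ~(1u << 3);            /* D = 0: count every cycle, not every 64th */
    __asm__ volatile("mcr p15, 0, %0, c9, c12, 0" :: "r"(pmcr)); /* write PMCR */
    /* PMCNTENSET bit 31 enables the cycle counter itself. */
    __asm__ volatile("mcr p15, 0, %0, c9, c12, 1" :: "r"(1u << 31));
    __asm__ volatile("isb");
}

/* Read the 32-bit cycle counter (PMCCNTR). */
static inline uint32_t pmu_read_cycles(void)
{
    uint32_t value;
    __asm__ volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(value));
    return value;
}
#endif

/* Elapsed cycles between two PMCCNTR samples. Unsigned subtraction stays
 * correct across a single 32-bit wrap of the counter. */
uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}
```

Typical use would be to call pmu_start_cycle_counter() once, sample pmu_read_cycles() before and after the code under test, and pass the two samples to cycles_elapsed().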
If your cycle count is large, then swap & play is also a solution. You can run Fast Models, take a checkpoint, and load it into Cycle Models. This article covers Linux rather than bare metal, but it will give you an idea of how it works. https://community.arm.com/soc/b/blog/posts/three-tips-for-using-linux-swap-play-with-arm-cortex-a-systems
Please feel free to contact me directly; I can help you decide on the best modeling solution for your needs or give you more information on the pros and cons of timing annotation for the Cortex-A15.
Regards,
Jason