I need to benchmark some C++ code on the FVP_MPS2_Cortex-M4 simulator.
I considered using the CMSIS function osKernelGetTickCount() to provide timestamps, but the resolution of the tick timer seems to be 1 ms, which is too coarse.
What would be a suitable clock counter to use for the timings?
Would ARM_CM_DWT_CYCCNT be suitable and, if so, how would I access it?
Best regards
David
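For reference, a minimal sketch of how the DWT cycle counter is usually enabled and read on a Cortex-M4. The register addresses and bit positions (DEMCR at 0xE000EDFC with TRCENA in bit 24, DWT_CTRL at 0xE0001000 with CYCCNTENA in bit 0, DWT_CYCCNT at 0xE0001004) are the architectural ones. The names CycleCounterRegs, cyccnt_enable, cyccnt_read and cyccnt_elapsed are hypothetical, chosen for this illustration only. The registers are passed in as pointers, which is an assumption made so that the same logic can be tried against ordinary variables on a host.

```cpp
#include <cassert>
#include <cstdint>

// Pointers to the three registers involved. On a Cortex-M4 target these
// would point at DEMCR (0xE000EDFC), DWT_CTRL (0xE0001000) and
// DWT_CYCCNT (0xE0001004). Hypothetical helper type, not a CMSIS API.
struct CycleCounterRegs {
    volatile std::uint32_t* demcr;
    volatile std::uint32_t* dwt_ctrl;
    volatile std::uint32_t* dwt_cyccnt;
};

constexpr std::uint32_t kDemcrTrcena      = 1u << 24;  // enables the DWT unit
constexpr std::uint32_t kDwtCtrlCyccntena = 1u << 0;   // enables CYCCNT

// Enable the trace block, clear the counter, then start it counting.
inline void cyccnt_enable(const CycleCounterRegs& r) {
    assert(r.demcr && r.dwt_ctrl && r.dwt_cyccnt);
    *r.demcr      = *r.demcr | kDemcrTrcena;
    *r.dwt_cyccnt = 0u;
    *r.dwt_ctrl   = *r.dwt_ctrl | kDwtCtrlCyccntena;
}

// Read the current counter value.
inline std::uint32_t cyccnt_read(const CycleCounterRegs& r) {
    return *r.dwt_cyccnt;
}

// Unsigned subtraction gives the right answer across a single 32-bit wrap.
inline std::uint32_t cyccnt_elapsed(std::uint32_t start, std::uint32_t stop) {
    return stop - start;
}
```

On the target, the struct would be filled with reinterpret_cast<volatile std::uint32_t*>(0xE000EDFCu) and the two DWT addresses. Take a reading before and after the code under test and pass both to cyccnt_elapsed.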
Hi Toshi and Ronan
I'm still struggling with obtaining meaningful timing info from CM_DWT_CYCCNT. I wrote a simple loop and timed it using the start_cyccnt() and stop_cyccnt() functions that Ronan suggested above. Here are the code and the results:
As you will see, the values seem increasingly meaningful as the total loop count increases, but the numbers are meaningless for a loop count of 75 or lower. Also, given that this is a cycle count, i.e. at least one cycle per instruction, a count of 1040 must be wrong for a loop count of 1000.
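The sanity check in this paragraph can be made explicit. The sketch below is an illustration only: the names cycles_per_iteration and is_plausible are hypothetical, and the figure of 3 instructions per loop iteration (for example an add, a compare and a branch) is an assumption, since the loop code itself is not reproduced here.

```cpp
#include <cstdint>

// Average cycles attributed to each loop iteration.
inline double cycles_per_iteration(std::uint32_t cycles, std::uint32_t iterations) {
    return static_cast<double>(cycles) / static_cast<double>(iterations);
}

// A measured count is only plausible if it allows at least one cycle for
// every instruction executed. min_instr_per_iter is an assumed lower bound
// on the number of instructions in one iteration.
inline bool is_plausible(std::uint32_t cycles,
                         std::uint32_t iterations,
                         std::uint32_t min_instr_per_iter) {
    const std::uint64_t floor_cycles =
        static_cast<std::uint64_t>(iterations) * min_instr_per_iter;
    return cycles >= floor_cycles;
}
```

With the reported figures, cycles_per_iteration(1040, 1000) is 1.04, and is_plausible(1040, 1000, 3) is false, which is the inconsistency described above.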
I wonder if the simulator is working correctly?
Do you have any thoughts on this, or any suggestions for an alternative timing method, please? (RTOS tick would be too coarse.)
Best regards
Hi David, this is not totally unexpected, and goes to the same issue that Toshi said earlier, that Fast Models are not 100% cycle accurate.
I think what you are seeing here is an effect of the 'quantum' of instructions that Fast Models use to accelerate execution. If you single-step through the loop with a low loop count, do you get different numbers?
Hi David,
FVPs and Fast Models (FM) sacrifice accuracy to gain simulation speed, so the result you see is expected. As I previously posted, users should not expect cycle accuracy from them. Please check what the models cannot do in the link that I put in my previous post (see below in case you missed it).
https://developer.arm.com/documentation/100964/1116/Introduction-to-the-Fast-Models-Reference-Manual/Model-capabilities
--
Fast Models can:
Fast Models cannot:
You can increase accuracy with a smaller quantum (-Q) or minimum sync latency (-M) value, both listed in Table 4, "Timing and performance options", of the FVP reference guide below. However, please note that even if you set them to the minimum value, i.e. 1, you won't see the expected result, because of how the model is implemented: it is designed to run very fast, not to be accurate. So again, users should not expect cycle accuracy from FVP/FM.
developer.arm.com/.../FVP-command-line-options
Please note that smaller values for these parameters will make the model run more slowly. Speed and accuracy are always a trade-off.
Kind regards,
Toshi
Hi Toshi and Ronan
Thanks for your replies. It's pretty clear that FVPs and Fast Models aren't suitable for my purposes. Does ARM offer a cycle-accurate simulator for Cortex-M4? I realise it may be slow but we could tolerate that. If so, could you please tell me where I can obtain it?
Best regards
I've started a new thread 'Cycle accurate simulator for Cortex-M4?' as this one seems to have run its course.