Hi team
I am running inference on a CNN model with the Arm NN framework and have captured a profile in Streamline.
In the example below, the model runs for 10 iterations on hardware with an Arm core (clock frequency 1.8 GHz).
The reported inference time for the second frame, for example, is 4.02 ms.
However, the corresponding cycle count on the Cortex-A78AE is 712 kilocycles, which does not match a runtime of 4 ms.
If the workload were saturating the core, so that it ran the whole time at a fixed frequency, you would expect the cycle counter to be a flat line for the entire 4 ms. The fact that you are not seeing that for any of the counters implies that the core is going idle for a substantial part of the 4 ms runtime.
712K cycles in 4.02 ms is only about 177 MHz of CPU capacity, roughly 10% of the 1.8 GHz clock, so I would check the software thread scheduling to work out what is happening here. You should be able to see that in the "Heat Map view" (selected from the dropdown where you have selected the "Arm NN timeline").
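To make the arithmetic explicit, here is a minimal sketch using the figures from the question (712K cycles, 4.02 ms, 1.8 GHz). The function and variable names are just illustrative, and it assumes the clock really is fixed at 1.8 GHz for the whole interval.

```python
def effective_frequency_hz(cycles, runtime_s):
    """Average rate at which cycles were counted over the interval."""
    return cycles / runtime_s


def busy_fraction(cycles, runtime_s, clock_hz):
    """Share of the interval the core was clocked and counting,
    assuming a fixed clock of clock_hz (an assumption, see DVFS note)."""
    return cycles / (runtime_s * clock_hz)


cycles = 712_000        # cycle count reported for the frame
runtime_s = 4.02e-3     # inference time reported for the same frame
clock_hz = 1.8e9        # nominal core clock from the question

eff = effective_frequency_hz(cycles, runtime_s)
busy = busy_fraction(cycles, runtime_s, clock_hz)

print(f"effective frequency: {eff / 1e6:.0f} MHz")   # 177 MHz
print(f"busy fraction: {busy:.1%}")                  # 9.8%
```

In other words, a fully busy core at 1.8 GHz would have counted about 7.2M cycles in that window, so the core appears to be active for only around a tenth of it.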
The other thing to check is whether your CPU frequency is really fixed, or whether DVFS is adjusting it to the workload.
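One quick way to check this, assuming your target runs Linux and exposes the standard cpufreq sysfs files (this may not be true on every platform), is to read the governor and frequency limits for the core. The helper names below are hypothetical, and the "looks fixed" test is only a rough heuristic.

```python
from pathlib import Path

CPUFREQ_FILES = (
    "scaling_governor",
    "scaling_cur_freq",   # values are reported in kHz
    "scaling_min_freq",
    "scaling_max_freq",
)


def read_cpufreq(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Read the cpufreq policy files for one CPU.
    Missing or unreadable files are returned as None."""
    base = Path(sysfs_root) / f"cpu{cpu}" / "cpufreq"
    info = {}
    for name in CPUFREQ_FILES:
        try:
            info[name] = (base / name).read_text().strip()
        except OSError:
            info[name] = None
    return info


def looks_fixed(info):
    """Heuristic only: the frequency is probably pinned if the governor is
    'performance' or the min and max limits are equal."""
    if info.get("scaling_governor") == "performance":
        return True
    lo, hi = info.get("scaling_min_freq"), info.get("scaling_max_freq")
    return lo is not None and lo == hi


if __name__ == "__main__":
    info = read_cpufreq(cpu=0)
    print(info)
    print("frequency looks fixed:", looks_fixed(info))
```

If the governor is something like schedutil or ondemand and the min and max limits differ, the clock is likely scaling with load, and the 1.8 GHz figure cannot be assumed for the whole capture.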