R5 vs A9 Performances

Hello guys,
I've been running the same code (available at https://gist.github.com/poz1/1714ddd68da5816624d6867ad6cc5d98 ) on an R5 board and an A9 board.
Optimisations are enabled, and my goal was to find the "right clock" for the A9, i.e. the clock at which it matches the performance of the R5.

I know they are conceptually different, but I was expecting the A9 (at 650 MHz) to be faster than the R5 (at 500 MHz).

Instead the outputs I got are:

- R5 

Starting computation
Output took 9879627907556208991 clock cycles.
Output took 32976237.10 us.

- A9

Starting computation
Output took 36834640184 clock cycles.
Output took 56668677.21 us.

I am puzzled because the R5 reports far more clock cycles yet takes roughly half the time (???) to complete.
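
As a rough sanity check, dividing each reported cycle count by its elapsed time gives the counter frequency those numbers imply. The sketch below does only that arithmetic; the function names are illustrative and do not come from the gist.

```c
#include <assert.h>
#include <stdint.h>

/* Counter frequency implied by a measurement, in MHz
   (cycles per microsecond is numerically the same as MHz). */
double implied_mhz(uint64_t cycles, double elapsed_us)
{
    assert(elapsed_us > 0.0);
    return (double)cycles / elapsed_us;
}

/* Returns 1 if the implied frequency lies within a fractional
   tolerance `tol` of the nominal clock, 0 otherwise. */
int matches_clock(uint64_t cycles, double elapsed_us,
                  double nominal_mhz, double tol)
{
    double mhz = implied_mhz(cycles, elapsed_us);
    double diff = mhz > nominal_mhz ? mhz - nominal_mhz : nominal_mhz - mhz;
    return diff <= tol * nominal_mhz;
}
```

With the figures above, `implied_mhz(36834640184ULL, 56668677.21)` comes out at about 650, which agrees with the A9 clock. The R5 figures give roughly 3e11 MHz, so that cycle count cannot be a real count at 500 MHz and the two cycle figures should not be compared directly.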

Do you have any idea how this could be explained?
Thank you :)

  • Did you time those runs by hand, with a stopwatch?

    For the cycles you should read the PMU counters.

    The R5 has no access to the A53 timers, so XTime has a different time base there.

    I suggest using a dedicated timer, checking its frequency with a GPIO, and then using it to measure the time.

    Anyway, my experience across all ARM cores is that small routines simply scale with the clock, with a slight performance advantage for cores with a longer pipeline.
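
The counter-based measurement suggested above could be sketched roughly as follows. This assumes a free-running 32-bit up-counter (the ARMv7 PMU cycle counter is 32 bits wide) whose frequency has been verified separately. Reading the counter itself is platform-specific and is left out; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Cycles between two samples of a free-running 32-bit counter.
   Unsigned subtraction is modulo 2^32, so one wraparound between
   the two samples is handled correctly; more than one is not. */
uint32_t elapsed_cycles(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);
}

/* Convert a cycle count to microseconds for a counter that
   runs at counter_hz (which must be measured or known). */
double cycles_to_us(uint64_t cycles, double counter_hz)
{
    assert(counter_hz > 0.0);
    return (double)cycles * 1e6 / counter_hz;
}
```

Note that a 32-bit counter clocked at 500 MHz wraps about every 8.6 seconds, so a run lasting tens of seconds wraps several times. For such runs the overflows would have to be counted as well, or a slower or wider timer used.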
