Hello guys, I've been running the same code (you can find it here: https://gist.github.com/poz1/1714ddd68da5816624d6867ad6cc5d98 ) on an R5 board and an A9 board. Optimisations are enabled, and my goal was to find the "right clock" for the A9 in order to obtain the same performance as the R5. I know they are conceptually different, but I was expecting the A9 (at 650 MHz) to be faster than the R5 (at 500 MHz).
Instead the outputs I got are:
- R5
Starting computation
Output took 9879627907556208991 clock cycles.
Output took 32976237.10 us.
- A9
Starting computation
Output took 36834640184 clock cycles.
Output took 56668677.21 us.
I am puzzled because the R5 uses many more clock cycles but takes about half the time (???) to complete. Do you have any idea how this could be explained? Thank you :)
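One quick sanity check on the numbers above is to divide each cycle count by its elapsed time and see which clock frequency that implies. The sketch below is only an illustration: the helper name `implied_mhz` is made up here and is not part of the linked gist, and it assumes both figures in a pair cover the same measured interval.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the original gist): returns the clock
 * frequency, in MHz, implied by a cycle count and an elapsed time in
 * microseconds. Cycles per microsecond is numerically equal to MHz.
 */
double implied_mhz(uint64_t cycles, double elapsed_us)
{
    if (elapsed_us <= 0.0) {
        return 0.0; /* avoid dividing by zero on a bad measurement */
    }
    return (double)cycles / elapsed_us;
}
```

With the A9 figures, `implied_mhz(36834640184ULL, 56668677.21)` comes out at about 650, which matches the configured 650 MHz clock. With the R5 figures, `implied_mhz(9879627907556208991ULL, 32976237.10)` comes out at roughly 3e11 MHz, nowhere near 500 MHz. That suggests the R5 cycle count itself may be unreliable (for example a counter or print-format problem), so the elapsed times are probably the safer basis for comparison.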
Yup, 16 bit, It says
EDIT: Actually, since this board is an FPGA, I have the option to lower the R5 bus to 16 bits; I am just unsure whether it would kill the board or not :D
If it is bare metal, try to make it run from the OCM, which is 256K and should be sufficient for code and data.
But the 16-bit bus seems to me to explain why you see such a big difference between the CA9 and the R5.
Hi,
Has the MMU been set up, and have the caches been enabled, on the Cortex-A9 before running this code (from Normal memory)?
There's been no mention of this, and I have encountered this happening before...
Just a thought.
Regards,
Stuart
Hello, thanks for all the valuable ideas :) Unfortunately I had to put the project on hold until mid-February :( I will surely test and let you know :) Thanks again, Alessandro