MCPS analysis for ARM9,ARM7 and cortex-A8

  • Note: This was originally posted on 19th October 2012 at http://forums.arm.com

    I have set the same CPU frequency for all three setups.
  • Note: This was originally posted on 22nd October 2012 at http://forums.arm.com

    I am using the targets shipped with RVDS 4.1 (ARM926EJS, ARM1136JS).
    I have taken care of memory management (using scatter and init_cache files) and branch prediction (simulator settings) for ARM11 and ARM9, but not for Cortex-A8.
    On Cortex-A8 I have only used the branch prediction settings (--cpu). If you have any memory-management-related files that could help me simulate the cycle results, that would be helpful.
  • Note: This was originally posted on 19th October 2012 at http://forums.arm.com

    > Can any one please explain me the reason of this behaviour

    ARM11 is generally faster for compiled code, but it is easy to write assembler tuned for an ARM9 and have it run slower on an ARM11. Without looking at your code it is hard to say exactly why. My guess is that you have some tight loops in your assembler which don't play nicely with the ARM11 branch predictor.

     * Always try to make sure you have two other instructions between the flag-setting operation and the use of the condition in a branch.
     * Don't branch to a branch instruction; the second one will always fail to predict.
     * ARM11 has a two-cycle load-use penalty, so don't use a loaded register in the next instruction or you will get stalls.
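
To make the load-use point concrete, here is a rough Python sketch of a toy in-order, single-issue pipeline that counts stall cycles. It is not a model of the real ARM11 pipeline and is not cycle accurate; the instruction tuples, register names, and the `count_stall_cycles` helper are all hypothetical, and the only figure taken from the post above is the two-cycle load-use penalty.

```python
def count_stall_cycles(program, load_use_penalty=2):
    """Count stall cycles in a toy in-order, single-issue pipeline.

    Each instruction is a tuple (dest, sources, is_load).
    Assumption: a loaded value becomes usable `load_use_penalty` cycles
    later than an ordinary ALU result would. This is a simplification,
    not a description of real ARM11 timing.
    """
    ready_at = {}  # register name -> first cycle its value can be consumed
    cycle = 0
    stalls = 0
    for dest, sources, is_load in program:
        # Wait until every source register is available.
        earliest = max([ready_at.get(reg, 0) for reg in sources], default=0)
        if earliest > cycle:
            stalls += earliest - cycle
            cycle = earliest
        # The instruction issues at `cycle`; record when its result is ready.
        ready_at[dest] = cycle + 1 + (load_use_penalty if is_load else 0)
        cycle += 1
    return stalls


# Loaded register used by the very next instruction.
back_to_back = [
    ("r0", ["r9"], True),         # LDR r0, [r9]
    ("r1", ["r0", "r2"], False),  # ADD r1, r0, r2
]

# Same work, with two independent instructions moved in between.
scheduled = [
    ("r0", ["r9"], True),         # LDR r0, [r9]
    ("r3", ["r4", "r5"], False),  # ADD r3, r4, r5
    ("r6", ["r7", "r8"], False),  # ADD r6, r7, r8
    ("r1", ["r0", "r2"], False),  # ADD r1, r0, r2
]

print(count_stall_cycles(back_to_back))
print(count_stall_cycles(scheduled))
```

In this toy model the back-to-back sequence pays the full penalty, while the reordered sequence hides it behind independent work. That is the same idea as keeping two other instructions between a flag-setting operation and the branch that uses it.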

    You also don't say what your benchmarking setup is in terms of the three CPU frequencies. ARM11 will probably be slower than an ARM9 at the same frequency "on average" because the pipeline is longer. However, the longer pipeline means it has a significantly higher top clock speed, which is where much of the performance comes from. Cortex-A8 is a dual-issue machine, so it should be faster at the same frequency.
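
The relationship between cycle counts, clock frequency, and MCPS can be sketched in a few lines of Python. The function names and all numbers below are made up for illustration and are not measurements from any of the cores discussed; the point is only that a core needing more cycles for the same work can still finish sooner if it runs at a higher clock.

```python
def mcps(cycles_per_frame, frames_per_second):
    """Million cycles per second needed to sustain a given frame rate."""
    return cycles_per_frame * frames_per_second / 1e6


def run_time_seconds(cycles, clock_hz):
    """Wall-clock time to execute `cycles` at a fixed clock frequency."""
    return cycles / clock_hz


# Hypothetical workload: 200,000 cycles per frame at 50 frames per second.
print(mcps(200_000, 50))

# Hypothetical comparison: core A needs fewer cycles, core B clocks higher.
time_a = run_time_seconds(1_000_000, 200e6)  # 1.0M cycles at 200 MHz
time_b = run_time_seconds(1_200_000, 400e6)  # 1.2M cycles at 400 MHz
print(time_a, time_b)
```

Comparing MCPS at one fixed frequency therefore isolates per-cycle efficiency, which is a different question from which core is faster at its own maximum clock.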
  • Note: This was originally posted on 21st October 2012 at http://forums.arm.com

    You don't mention what the platforms are, or much about how the tests were run.
    For example, if you're using the models shipped with RVDS these are NOT cycle accurate. So any figures for the ARM11/Cortex-A8 you get will be misleading at best, meaningless at worst.
    How are you setting up memory management? The more advanced the processor, the bigger the impact of not (or incorrectly) setting up memory management. Without caches, branch prediction, etc. set up correctly, they might conceivably be worse than the ARM9.