MCPS analysis for ARM9, ARM7 and Cortex-A8

  • Note: This was originally posted on 19th October 2012 at http://forums.arm.com

    > Can any one please explain me the reason of this behaviour

    ARM11 is generally faster for compiled code, but it is easy to write assembler that is tuned for an ARM9 and have it run slower on an ARM11. Without looking at your code it is hard to say exactly why. My guess is that you have some tight loops in your assembler which don't play nicely with the ARM11 branch predictor.

    * Always try to make sure you have two other instructions between the flag-setting operation and the use of the condition in a branch.
    * Don't branch to a branch instruction; the second one will always fail to predict.
    * ARM11 has a two-cycle load-use penalty, so don't use a loaded register in the next instruction or you will get stalls.
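
To make the load-use point concrete, here is a minimal sketch of a toy in-order issue model in Python. It is not a cycle-accurate ARM11 simulator: the `Instr` type, the `count_load_use_stalls` helper and the two instruction sequences are hypothetical, and the only figure taken from the post is the two-cycle penalty. The model assumes one instruction issues per cycle and that a consumer of a loaded register must wait until two extra cycles after the load have passed.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str                 # mnemonic, e.g. "ldr" or "add"
    dest: Optional[str]     # destination register, if any
    srcs: Tuple[str, ...]   # source registers


def count_load_use_stalls(program, penalty=2):
    """Count stall cycles in a toy single-issue, in-order model.

    Assumption (not from any ARM manual): the result of a load issued at
    cycle t can first be consumed by an instruction issued at t + 1 + penalty.
    """
    ready_at = {}  # register -> earliest cycle at which a consumer may issue
    cycle = 0
    stalls = 0
    for ins in program:
        earliest = max([cycle] + [ready_at.get(r, 0) for r in ins.srcs])
        stalls += earliest - cycle
        cycle = earliest
        if ins.dest is not None:
            if ins.op == "ldr":
                ready_at[ins.dest] = cycle + 1 + penalty
            else:
                # Non-load results are treated as available to the next instruction.
                ready_at.pop(ins.dest, None)
        cycle += 1
    return stalls


# Each load is consumed by the very next instruction.
naive = [
    Instr("ldr", "r0", ("r1",)),
    Instr("add", "r2", ("r2", "r0")),
    Instr("ldr", "r3", ("r1",)),
    Instr("add", "r4", ("r4", "r3")),
]

# Same work, with both loads hoisted ahead of their consumers.
scheduled = [
    Instr("ldr", "r0", ("r1",)),
    Instr("ldr", "r3", ("r1",)),
    Instr("add", "r2", ("r2", "r0")),
    Instr("add", "r4", ("r4", "r3")),
]

print("naive stalls:", count_load_use_stalls(naive))
print("scheduled stalls:", count_load_use_stalls(scheduled))
```

Under these assumptions the reordered sequence stalls less than the naive one. Real cores have forwarding paths and other interlocks that this model ignores, so treat the counts as illustrative only.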

    You also don't say what your benchmarking setup is in terms of the three CPU frequencies. At the same frequency, ARM11 will probably be slower than an ARM9 "on average" because its pipeline is longer. However, the longer pipeline allows a significantly higher top clock speed, which is where much of the performance comes from. Cortex-A8 is a dual-issue machine, so it should be faster at the same frequency.
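
As a rough illustration of why the clock frequencies matter when comparing cycle counts, the sketch below converts a measured run time and a clock frequency into cycles consumed. The helper name `cycles_consumed`, the core labels and all the numbers are made up for illustration; they are not measurements from this thread.

```python
def cycles_consumed(elapsed_us: int, clock_mhz: int) -> int:
    """Cycles spent on a workload: microseconds * (cycles per microsecond)."""
    return elapsed_us * clock_mhz


# Hypothetical runs of the same workload: (elapsed time in us, clock in MHz).
runs = {
    "core_a": (10_000, 200),  # shorter pipeline, lower clock (assumed figures)
    "core_b": (6_000, 400),   # longer pipeline, higher clock (assumed figures)
}

for name, (elapsed_us, clock_mhz) in runs.items():
    cycles = cycles_consumed(elapsed_us, clock_mhz)
    print(f"{name}: {cycles / 1e6:.1f} million cycles in {elapsed_us / 1000:.1f} ms")
```

With these made-up figures, core_b spends more cycles on the workload yet finishes sooner, which is the distinction between per-cycle efficiency and wall-clock performance made above.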