
Cortex A8 Instruction Cycle Timing

Note: This was originally posted on 17th March 2011 at http://forums.arm.com

Hi, sorry for my bad English.

I need to work out the latency for two instructions, and all I have is the ARM Cortex-A8 documentation (chapter 16).
But I have no idea how to do this using that documentation.
  • Note: This was originally posted on 11th May 2011 at http://forums.arm.com


    I am confused.
    From the specs, ADD needs its source registers at E2, and its destination register is available at E2 too. So these two instructions can be dual-issued:
    add r1, r2, r3
    add r4, r5, r1
    Because the second ADD requires r1 at E2, and the first ADD makes r1 available at E2 too.
    If ADD needed its source registers at E1, I would agree that the two instructions above can't be dual-issued.
    One explanation, I think, is that ADD needs its source registers at the beginning of E2 and makes its destination register available at the end of E2. But then why don't the specs say that the destination register is available at E3?
    I know I'm wrong, but I can't explain why.




    You're absolutely right!
    It's a strange convention used by ARM!
    There must be a good reason for it, but I don't know what it is.
    - Maybe it would be confusing to say that a MUL result cycle is 7 when the functional unit has only 6 stages.
    - Maybe something can happen between the end of one cycle and the beginning of the next. I'm talking about shortcuts (forwarding). MUL shortcuts are one cycle faster than the documentation indicates (or than I understood it to say). It's possible that the forwarding happens before the beginning of the cycle, which could explain this difference.

    But in the end, this is not really a problem once you understand how to read the cycle table.
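
To make the reading of the cycle table concrete, here is a rough Python sketch of the arithmetic. It assumes the convention discussed above: a result listed at stage R is ready at the end of R, and a source listed at stage S is read at the start of S. The function names are made up for illustration. The E5 value in the usage lines is a hypothetical producer, not a figure quoted from the TRM, and forwarding shortcuts are not modelled.

```python
def min_issue_gap(result_stage: int, source_stage: int) -> int:
    """Minimum number of cycles the consumer must issue after the producer.

    Stages E1..E6 are given as integers 1..6.
    Assumption (the convention from the thread): the producer's result is
    ready at the END of `result_stage`, and the consumer reads its source
    at the START of `source_stage`.
    """
    # A producer issued at cycle t has its result ready after cycle t + R.
    # A consumer issued at cycle t + gap reads at the start of cycle
    # t + gap + S, so we need gap + S >= R + 1.
    return max(0, result_stage - source_stage + 1)


def can_dual_issue(result_stage: int, source_stage: int) -> bool:
    """True if a dependent pair could issue in the same cycle (gap of 0)."""
    return min_issue_gap(result_stage, source_stage) == 0


# ADD feeding ADD: result at E2, source needed at E2 (values from the thread).
print(min_issue_gap(2, 2))   # gap for: add r1, r2, r3 / add r4, r5, r1
print(can_dual_issue(2, 2))

# Hypothetical producer whose result is listed at E5, feeding an ADD source at E2.
print(min_issue_gap(5, 2))
```

Under this reading, the dependent ADD pair needs a gap of one cycle, so it cannot be dual-issued, even though both table entries say E2. Reading the table as "result available at the start of E2" would give a gap of zero, which is exactly the confusion in the question.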