Pipeline Stage Read and Write
Etienne SOBOLE
over 12 years ago
Note: This was originally posted on 5th March 2011 at
http://forums.arm.com
I'm still trying to understand the cycle timing tables of the Cortex-A8.
Most of the tests I've made are based on these assumptions:
- source registers are needed at the beginning of their stage
- destination registers are released at the end of their stage.
With those rules, the results seem quite good.
For example, this code takes 3 cycles:
add r5, r5, #1
mov r6, r5
because ADD releases r5 at the end of stage 2 while MOV needs it at the beginning of stage 1.
That works, and it matches the real execution timing.
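The two rules above can be sketched as a small calculation. This is only a toy model of my assumptions, not the documented behaviour of the core: the function name `earliest_issue` and its parameters are made up for illustration, dual issue is ignored, and an instruction issued in cycle t is assumed to be in stage Ek during cycle t + k - 1.

```python
def earliest_issue(producer_issue, result_stage, need_stage):
    """Earliest issue cycle of a dependent single-cycle instruction.

    Hypothetical helper: assumes the producer's result becomes readable
    after the end of stage E<result_stage>, and the consumer reads its
    source at the start of stage E<need_stage>.
    """
    # First cycle in which the forwarded result can be read.
    ready = producer_issue + result_stage
    # Without any dependency the consumer would issue in the next cycle.
    no_stall = producer_issue + 1
    # The consumer reaches E<need_stage> (need_stage - 1) cycles after issue.
    return max(no_stall, ready - (need_stage - 1))


add_issue = 0
# ADD produces r5 at the end of E2, MOV needs r5 at the start of E1.
mov_issue = earliest_issue(add_issue, result_stage=2, need_stage=1)
total_cycles = mov_issue + 1  # cycles 0..mov_issue inclusive
print(mov_issue, total_cycles)  # 2 3
```

Under these assumptions the MOV cannot issue before cycle 2, which gives the 3 cycles measured.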
But I have a problem with the MLA shortcut:
mul r4, r5, r4
mla r0, r6, r7, r4
The MUL should release r4 at the end of stage 5 (of the second cycle of the MUL).
The MLA needs r4 at the beginning of stage 4 (due to the MLA shortcut).
So the code should take 5 cycles, but in fact it takes only 4 cycles.
Is it possible that r4 is only needed at the beginning of stage 4 of the second cycle of the MLA?
Or maybe the forwarding is done at the end of stage 4, which I could treat as the same thing as the beginning of stage 5.
That could explain the missing cycle.
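To make the three hypotheses comparable, here is a rough sketch of the same stage-counting rules extended to two-cycle instructions. Everything in it is an assumption for illustration: the helper names, the idea that the second cycle of a multiply simply starts its own E1 one cycle later, and the absence of dual issue. It only shows which reading of the table reproduces the measured 4 cycles; it does not say which one the hardware implements.

```python
def earliest_issue(prod_issue, prod_cycles, result_stage,
                   need_stage, need_pass=1):
    """Earliest issue cycle of a dependent instruction (toy model).

    prod_cycles  : issue cycles taken by the producer (2 for MUL here)
    result_stage : stage of the producer's LAST cycle that ends with
                   the result available
    need_stage   : stage in which the consumer reads the register
    need_pass    : which of the consumer's cycles does the read (1 or 2)
    """
    # First cycle in which the result can be read.
    ready = prod_issue + (prod_cycles - 1) + result_stage
    # Cycles between the consumer's issue and its read of the register.
    read_offset = (need_pass - 1) + (need_stage - 1)
    # The consumer can never issue before the producer has finished issuing.
    return max(prod_issue + prod_cycles, ready - read_offset)


def mul_mla_total(need_stage, need_pass):
    """Total cycles for MUL (issued in cycle 0) followed by a 2-cycle MLA."""
    mla_issue = earliest_issue(0, 2, 5, need_stage, need_pass)
    return mla_issue + 2


print(mul_mla_total(4, 1))  # 5: r4 read at start of E4, first MLA cycle
print(mul_mla_total(4, 2))  # 4: r4 read at start of E4, second MLA cycle
print(mul_mla_total(5, 1))  # 4: r4 read at start of E5 (end of E4)
```

Both alternative readings give back-to-back issue of the MUL and the MLA in this model, so the measurement alone cannot tell them apart.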
Ruben Buchatskiy
over 12 years ago
Note: This was originally posted on 23rd March 2011 at
http://forums.arm.com
A multiply that is followed by a MAC with a dependency on the accumulator, Rn register, triggers a special accumulator forwarding. This enables both instructions to issue back-to-back because Rn is required as a source in E4. If this accumulator forwarding is not used, Rn is required in E2.