Cortex A8 Instruction Cycle Timing
barney vardanyan
over 7 years ago
Note: This was originally posted on 17th March 2011 at
http://forums.arm.com
Hi, sorry for my bad English.
I need to work out the latency of two instructions, and all I have is the ARM Cortex-A8 documentation (chapter 16).
But I have no idea how to do this using that documentation.
Etienne SOBOLE
over 7 years ago
Note: This was originally posted on 29th April 2011 at
http://forums.arm.com
For the dual-issue rules, everything is here:
http://infocenter.ar...k/Babhefaj.html
For the functional units:
Once an instruction has been decoded, it is sent to a specific functional unit (called an "fu").
These "fu" are linked to pipelines.
On the Cortex-A8, you have 2 pipelines and 4 fu:
ALU0
MUL0
ALU1
LS (load/store)
ALU0 and MUL0 are linked to pipeline 0.
ALU1 is linked to pipeline 1.
There is no MUL1. That's why you can't execute a MUL operation in pipeline 1.
The LS fu is linked to both pipeline 0 and pipeline 1. That's why you can execute only one memory access per cycle, but this access can be done in pipeline 0 or pipeline 1.
On the Cortex-A8 you can only execute 1 MUL per cycle and 1 LDR/STR per cycle, but the reason these instructions can't dual-issue is not the same in each case!
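To make the layout above concrete, here is a minimal Python sketch of it. It is only a toy model of the description in this post, not something taken from the Arm documentation: the names `FU_PIPELINES`, `CLASS_FUS` and `can_pair`, and the three instruction classes, are made up for illustration, and the model looks only at functional-unit availability while ignoring every other dual-issue restriction.

```python
# Toy model of the functional-unit layout described in this post.
# All names here are hypothetical; this is not an Arm-provided model.

# Which pipelines each functional unit is attached to.
FU_PIPELINES = {
    "ALU0": {0},
    "MUL0": {0},
    "ALU1": {1},
    "LS": {0, 1},  # single load/store unit, reachable from either pipeline
}

# Assumed instruction classes and the units able to execute them.
CLASS_FUS = {
    "alu": ["ALU0", "ALU1"],
    "mul": ["MUL0"],
    "mem": ["LS"],
}


def can_pair(first, second):
    """Return True if `first` (in pipeline 0) and `second` (in pipeline 1)
    could issue in the same cycle, judging by functional units only."""
    for fu0 in CLASS_FUS[first]:
        if 0 not in FU_PIPELINES[fu0]:
            continue  # this unit is not reachable from pipeline 0
        for fu1 in CLASS_FUS[second]:
            # The second instruction needs a different unit on pipeline 1.
            if fu1 != fu0 and 1 in FU_PIPELINES[fu1]:
                return True
    return False
```

In this model, two MULs fail to pair because the second one has no unit on pipeline 1, while two memory accesses fail because both need the single LS unit. That is the "not the same reason" distinction made above.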
Do you need to handle them in a cycle counter?
Note: what I say next is not certain! It is only speculation (but it seems to be true).
Let's suppose you have this code:
LDRD r0, r1, [r5]!
LDR r3, [r6]!
LDRD takes 2 cycles (and starts in cycle 1 on pipeline 0).
Because it is a multi-cycle instruction, only its last cycle can dual-issue.
LDR can be executed in pipeline 1.
So, if you just apply the ARM rules described at the link I gave you earlier, the LDR should execute in cycle 2 on pipeline 1.
To me, this is not possible because the LS unit is in use (it is in use for 2 cycles). So the LDR will execute in cycle 3 on pipeline 0.
This model seems to be correct, but there are not many cases where stall cycles are due to an fu conflict!
In fact, I think the rule should be:
"Multi-cycle instructions must issue in pipeline 0 and can only dual issue in their last iteration if the paired instruction does not use the same functional unit."
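The two readings of the rule can be compared with a small Python sketch. It encodes only the speculation in this post, not measured Cortex-A8 behaviour. The function name `issue_cycle_of_second` and its parameters are hypothetical, and it assumes the first instruction starts in cycle 1 on pipeline 0 and that nothing else stalls.

```python
def issue_cycle_of_second(first_cycles, first_fu, second_fu, check_fu):
    """Return (cycle, pipeline) for the second instruction.

    Assumes the first instruction starts in cycle 1 on pipeline 0 and
    occupies its functional unit for `first_cycles` cycles.
    check_fu=False: only the "dual issue in the last iteration" rule.
    check_fu=True:  the same rule plus the functional-unit condition
                    proposed in this post.
    """
    last_cycle = first_cycles  # last iteration of the first instruction
    if check_fu and first_fu == second_fu:
        # The unit is still busy: wait one more cycle, issue on pipeline 0.
        return (last_cycle + 1, 0)
    # Otherwise pair with the last iteration, on pipeline 1.
    return (last_cycle, 1)


# LDRD (2 cycles, LS unit) followed by LDR (LS unit):
rule_as_quoted = issue_cycle_of_second(2, "LS", "LS", check_fu=False)
rule_proposed = issue_cycle_of_second(2, "LS", "LS", check_fu=True)
print(rule_as_quoted)  # (2, 1): cycle 2, pipeline 1
print(rule_proposed)   # (3, 0): cycle 3, pipeline 0
```

The one-cycle difference between the two results is the stall I attribute to the fu conflict. A real cycle counter would also need the other dual-issue rules from the documentation.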