Questions about code benchmark difference between A8 and A15
Adam Yao
over 9 years ago
Note: This was originally posted on 1st May 2013 at http://forums.arm.com
Hi all,
I benchmarked some protocol stack code on Cortex-A8 and Cortex-A15. The code is the same on both, and I enabled the L1/L2 caches and branch prediction on both CPUs (all code and data are placed in DDR, and the MMU attributes for the DDR region are configured as WB & WA for both the L1 and L2 caches). Overall the A15's result is better than the A8's, but there are some things I cannot explain. The results are below:

Event                                  A8       A15
Overall cycles                         71821    30057
L1I cache miss                         469      431
L1D cache miss                         1015     139
L2 cache read miss                     78       584
L2 cache write miss                    247      26
Mispredicted/not-predicted branches    469      217
Predicted branches                     1015     11385

What confuses me:
(1) Why is the A15's L1D cache miss count so much lower than the A8's, when both CPUs have a 32K L1D cache? What causes the A15's L1D improvement?
(2) Why is the A15's L2 cache read miss count so much higher than the A8's?
(3) For branch prediction, why does the sum of predicted and mispredicted/not-predicted branches differ between the A8 and the A15, and why is the A15's count about 10 times the A8's?
I look forward to an explanation.
Thank you.
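A quick arithmetic check on the posted figures can help frame question (3). The sketch below is only illustrative: the dictionary keys and helper names are made up for this example and do not come from any Arm tool. It computes the cycle ratio and the total branch counts, and lists counters that report exactly the same value on one CPU. On the A8, the two branch counters match the L1I and L1D miss counts exactly. That may be coincidence, but it could also mean the A8 branch counters were not programmed with the intended event numbers.

```python
# Counter values copied from the post, stored as (A8, A15).
# Key names are hypothetical labels chosen for this sketch.
counters = {
    "cycles": (71821, 30057),
    "l1i_miss": (469, 431),
    "l1d_miss": (1015, 139),
    "l2_read_miss": (78, 584),
    "l2_write_miss": (247, 26),
    "branch_mispredicted": (469, 217),
    "branch_predicted": (1015, 11385),
}

A8, A15 = 0, 1  # column indices into the tuples above


def total_branches(cpu):
    """Sum of predicted and mispredicted/not-predicted branches for one CPU."""
    return counters["branch_mispredicted"][cpu] + counters["branch_predicted"][cpu]


def duplicated_values(cpu):
    """Return {value: [counter names]} for values reported by more than one counter."""
    by_value = {}
    for name, values in counters.items():
        by_value.setdefault(values[cpu], []).append(name)
    return {value: names for value, names in by_value.items() if len(names) > 1}


speedup = counters["cycles"][A8] / counters["cycles"][A15]
branch_ratio = total_branches(A15) / total_branches(A8)

if __name__ == "__main__":
    print(f"A8/A15 cycle ratio: {speedup:.2f}")
    print(f"Total branches: A8={total_branches(A8)}, A15={total_branches(A15)}")
    print(f"A15/A8 branch ratio: {branch_ratio:.2f}")
    print(f"Duplicated A8 values: {duplicated_values(A8)}")
    print(f"Duplicated A15 values: {duplicated_values(A15)}")
```

If the same code path ran on both CPUs, the total branch counts would be expected to be roughly similar, so a large gap is a hint to re-check the event selection before comparing the two columns.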
Gilead Kutnick
over 9 years ago
Note: This was originally posted on 6th May 2013 at http://forums.arm.com
You can find the performance counter event numbers in the Cortex-A8 TRM, section 3.2.49 (System Control Coprocessor > System control coprocessor registers > c9, Event Selection Register):
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0344k/index.html
I agree they don't seem to be the same as the Cortex-A15's.