[ARM926EJS] improve write miss
stanley shih
over 12 years ago
Note: This was originally posted on 5th October 2010 at
http://forums.arm.com
Hello experts,
The platform I am using is an ARM926EJ-S. The cache policy is write-back with read-allocate only.
Profiling shows that the program I want to optimize has too many write misses (write buffer refill).
Can anyone give me some guidelines or tricks to improve my program? Thanks.
BR,
Stanley
Peter Harris
over 12 years ago
Note: This was originally posted on 8th November 2010 at
http://forums.arm.com
"What should I do to load the cache line in advance with minimal cost?"
You can't on an ARM9. It's a fully in-order core, so if you issue a load to act as a preload, the core still blocks until that "preload" has filled the cache line. You stall just as long, only earlier. On an ARM9 the best you can do is avoid getting that line evicted in the first place, and minimize the number of lines you need to load.
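As a rough illustration of "minimize the number of lines you need to load", here is a minimal C sketch that counts how many cache lines the frequently written fields of a structure occupy under two layouts. It assumes the 32-byte (8-word) ARM926EJ-S cache line and a structure that starts on a line boundary. The struct and field names are hypothetical and not taken from the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-byte cache line (8 words), as on the ARM926EJ-S. */
#define CACHE_LINE_BYTES 32u

/* Hypothetical layout A: hot fields separated by rarely used data. */
struct stats_spread {
    uint32_t hot_count;
    uint8_t  cold_a[60];
    uint32_t hot_sum;
    uint8_t  cold_b[60];
    uint32_t hot_max;
};

/* Hypothetical layout B: the same fields, hot ones grouped first. */
struct stats_packed {
    uint32_t hot_count;
    uint32_t hot_sum;
    uint32_t hot_max;
    uint8_t  cold_a[60];
    uint8_t  cold_b[60];
};

/* Count the distinct cache lines covered by a set of byte offsets,
 * assuming the object itself starts on a cache-line boundary. */
static size_t lines_touched(const size_t *offsets, size_t n)
{
    size_t distinct = 0;
    for (size_t i = 0; i < n; i++) {
        size_t line = offsets[i] / CACHE_LINE_BYTES;
        int seen = 0;
        for (size_t j = 0; j < i; j++) {
            if (offsets[j] / CACHE_LINE_BYTES == line) {
                seen = 1;
                break;
            }
        }
        if (!seen) {
            distinct++;
        }
    }
    return distinct;
}

static size_t spread_hot_lines(void)
{
    const size_t offs[] = {
        offsetof(struct stats_spread, hot_count),
        offsetof(struct stats_spread, hot_sum),
        offsetof(struct stats_spread, hot_max),
    };
    assert(sizeof offs / sizeof offs[0] == 3);
    return lines_touched(offs, 3);
}

static size_t packed_hot_lines(void)
{
    const size_t offs[] = {
        offsetof(struct stats_packed, hot_count),
        offsetof(struct stats_packed, hot_sum),
        offsetof(struct stats_packed, hot_max),
    };
    return lines_touched(offs, 3);
}
```

With these layouts the hot fields span three lines when spread out and one line when grouped. Whether that helps a given program depends on its real access pattern, so treat it as something to check against the profile rather than a guaranteed win.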
This is another case where a newer core would help - ARM11 and the Cortex-R and Cortex-A families decouple the load pipeline from ALU execution, and only interlock when data that is needed is not yet available. That said, this is mostly useful for hiding a few cycles of latency, not for hiding many tens of cycles of cache miss overhead - a preload is still the better solution for that.
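For the newer cores mentioned here, a preload hint might look like the following sketch. It uses `__builtin_prefetch`, a GCC/Clang builtin. Whether it becomes a real preload instruction depends on the target and compiler flags, and per the explanation above it would not be expected to hide miss latency on an ARM9. The function name and `PREFETCH_DISTANCE` are hypothetical; the distance is a placeholder that would need tuning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tuning value: how many elements ahead to hint. */
#define PREFETCH_DISTANCE 16u

/* Sum an array while hinting that data further ahead will be read soon.
 * The hint does not change the result; it only affects timing on cores
 * that act on it. */
uint32_t sum_with_preload(const uint32_t *data, size_t n)
{
    assert(data != NULL || n == 0);

    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Arguments: address, 0 = read access, 3 = high temporal locality. */
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 3);
        }
        sum += data[i];
    }
    return sum;
}
```

The bounds check keeps the hinted address inside the array. Hinting once per cache line instead of once per element would cut the overhead, at the cost of a slightly more involved loop.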