
Guidelines on reducing Cache Miss rate

Hi Experts,

Is there any document with general software guidelines for reducing the cache miss rate on the ARMv7 architecture?

If it is more specific to A/R/M, that would be great.

  • A quick Google search for "reduce cache miss rate" turns up this page: Reducing Cache Miss Rate, which is quite useful.

    Most recent ARM cores support prefetch instructions such as PLD and PLI. These can improve the performance of loops, especially over data which exhibits low locality.
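
    As a rough sketch, you rarely need to write PLD by hand: GCC and Clang lower `__builtin_prefetch` to PLD on ARM targets. The prefetch distance of 8 elements below is an assumption you would tune per core and per workload, not a recommended value.

    ```c
    #include <stddef.h>

    /* Sum a large array, prefetching a few cache lines ahead of the
       current access. The compiler emits PLD for __builtin_prefetch
       on ARM; arguments are (address, rw=0 for read, locality hint). */
    long sum_with_prefetch(const long *data, size_t n)
    {
        long total = 0;
        for (size_t i = 0; i < n; i++) {
            if (i + 8 < n)
                __builtin_prefetch(&data[i + 8], 0, 1); /* read, low temporal reuse */
            total += data[i];
        }
        return total;
    }
    ```

    Note that prefetching only helps when the data is not already cached and the prefetch lands early enough to hide the memory latency; over-aggressive prefetching can evict useful lines and make things worse.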

    Very often, though, the most important thing you can do is look at your algorithm. And then look at your implementation of that algorithm. A zero-copy algorithm can make better use of available cache space, for instance. And a simplistic implementation of a matrix multiplication operation will often show very poor performance, especially for matrices which are large compared to the cache size, because of a high level of contention. Re-implementing it using strips or blocks/tiles can improve performance by increasing the amount of reuse of cached data.
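
    A minimal sketch of the tiled approach, for square row-major matrices; `BLOCK` is an assumed tile size that you would tune so that the three active tiles fit in the data cache together:

    ```c
    #include <stddef.h>

    #define BLOCK 32  /* assumed tile size; tune for your cache */

    /* Blocked (tiled) multiply: C = A * B for n x n row-major doubles.
       Each tile of A, B and C is reused many times while it is still
       cache-resident, instead of streaming whole rows and columns. */
    void matmul_tiled(const double *a, const double *b, double *c, size_t n)
    {
        for (size_t i = 0; i < n * n; i++)
            c[i] = 0.0;

        for (size_t ii = 0; ii < n; ii += BLOCK)
            for (size_t kk = 0; kk < n; kk += BLOCK)
                for (size_t jj = 0; jj < n; jj += BLOCK)
                    /* multiply one tile pair, accumulating into C */
                    for (size_t i = ii; i < ii + BLOCK && i < n; i++)
                        for (size_t k = kk; k < kk + BLOCK && k < n; k++) {
                            double aik = a[i * n + k];
                            for (size_t j = jj; j < jj + BLOCK && j < n; j++)
                                c[i * n + j] += aik * b[k * n + j];
                        }
    }
    ```

    The arithmetic is identical to the naive triple loop; only the iteration order changes, so each loaded cache line is used many times before it is evicted.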

    Think also about data structures. Sparse arrays and linked lists often cache extremely poorly; in these cases the effect of caches is sometimes to increase memory traffic rather than reduce it.
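
    To illustrate the point, here are the same values stored two ways (names are illustrative only): each linked-list node can land on a different cache line, so every step of the traversal is a dependent load that may miss, while packing the payloads into a contiguous array gives sequential accesses that the hardware prefetcher handles well.

    ```c
    #include <stddef.h>

    struct node { int value; struct node *next; };

    /* Pointer-chasing traversal: each node may be on its own line,
       and the next address is not known until the load completes. */
    int sum_list(const struct node *head)
    {
        int total = 0;
        for (; head; head = head->next)
            total += head->value;
        return total;
    }

    /* Contiguous traversal: several values per cache line, and the
       access pattern is predictable, so misses are amortised. */
    int sum_array(const int *values, size_t n)
    {
        int total = 0;
        for (size_t i = 0; i < n; i++)
            total += values[i];
        return total;
    }
    ```

    If you must keep list semantics, allocating nodes from a contiguous pool recovers much of the locality.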

    Another good technique is to align data structures with cache line boundaries. If an individual data element fits within a cache line then the whole element can be loaded with only one cache miss.
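
    A sketch of this, assuming a 64-byte cache line (typical for Cortex-A cores, but check your core's documentation); the struct and field names are hypothetical:

    ```c
    #include <stdalign.h>

    #define CACHE_LINE 64  /* assumed line size; core-specific */

    /* Aligning the struct to the line size guarantees it never
       straddles two lines, so touching any field pulls the whole
       element in with a single miss. Alignment also pads sizeof
       up to a multiple of the line, which additionally prevents
       false sharing when elements are updated by different cores. */
    struct packet_stats {
        unsigned long packets;
        unsigned long bytes;
        unsigned long errors;
    } __attribute__((aligned(CACHE_LINE)));
    ```

    The `__attribute__((aligned(...)))` form is GCC/Clang syntax; for heap objects the same effect needs an aligned allocator such as `aligned_alloc` or `posix_memalign`.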

    There is a lot of other useful material out there.

    Hope this helps.

    Chris

