
ARM1176JZ-S, cache config: effective cache size calculation

Note: This was originally posted on 22nd February 2009 at http://forums.arm.com

Hello,

1) I am using ARM1176JZ-S core with WinCE Platform. The cache memory is configured as follows

    DCache: 128 sets, 4 ways, 32-byte line size, 16384 bytes total
    ICache: 128 sets, 4 ways, 32-byte line size, 16384 bytes total

    Now I want to know the effective data cache size, that is, how much data from main memory
    can be cached and accessed without cache thrashing within a function.

2) Are the cache set size (128 sets) and the cache block/segment size of other processors the same thing?

Kindly reply to this mail; thanks in advance.

Regards,
Deven
  • Note: This was originally posted on 12th March 2009 at http://forums.arm.com

    If the cache is in write-back mode and the line is dirty then the cache line has to be written to memory before the line can be reloaded with new data. If the line is not dirty (either because it is empty, the data is read-only, or it has been flushed) then you can skip this step and just load the new data.

    How long the write-back stage takes depends on the microarchitecture of the processor - some designs block on the cache line while it is being written back, whilst others have dedicated victim buffers for storing the lines to be evicted, which frees the line up for new data much earlier.
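    That eviction decision can be sketched in a few lines (a hypothetical, simplified model for illustration, not the actual ARM1176 hardware; `refill` and the structure fields are invented names):

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    static int write_backs = 0;   /* count of lines written back to memory */

    /* Hypothetical model of one line of a write-back cache */
    struct cache_line {
        bool valid, dirty;
        unsigned tag;
        unsigned char data[32];
    };

    /* Reuse a line for a new address: the dirty case pays for an extra
     * write-back step; clean or invalid lines go straight to the refill. */
    static void refill(struct cache_line *l, unsigned new_tag,
                       const unsigned char *new_data) {
        if (l->valid && l->dirty)
            write_backs++;        /* would store l->data back to memory here */
        memcpy(l->data, new_data, sizeof l->data);
        l->tag = new_tag;
        l->valid = true;
        l->dirty = false;
    }

    int main(void) {
        unsigned char mem[32] = {0};
        struct cache_line l = {0};

        refill(&l, 1, mem);       /* invalid line: no write-back needed */
        l.dirty = true;           /* simulate a store hitting the line */
        refill(&l, 2, mem);       /* dirty line: write-back happens first */
        refill(&l, 3, mem);       /* clean line: no write-back needed */

        printf("write-backs: %d\n", write_backs);   /* prints "write-backs: 1" */
        assert(write_backs == 1);
        return 0;
    }
    ```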

    > The "algorithm execution" you're talking about is (assuming again) a rather long loop, therefore I would not expect you to notice the effect of 'the very first memory access'...

    It depends on the data structures. If you have data that is nicely packed (and accessed) in cache-line-sized chunks then you probably won't see much effect. If you have data structures which access one byte per cache line, pollute a lot of the cache, and only then access the next byte in each cache line, you are likely to see a huge performance loss.
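    A minimal C sketch of that contrast (the buffer size is assumed for illustration and is deliberately larger than the 16 KB D-cache; on hardware you would time the two loops, here they just show the two access orders):

    ```c
    #include <assert.h>
    #include <stdio.h>
    #include <string.h>

    #define LINE   32     /* ARM1176 line size in bytes */
    #define NLINES 4096   /* 128 KB working set: well beyond the 16 KB D-cache */

    static unsigned char buf[NLINES * LINE];

    /* Friendly: consume every byte of a line before moving to the next,
     * so each line is fetched once and fully used. */
    static unsigned long sum_sequential(void) {
        unsigned long s = 0;
        for (size_t i = 0; i < sizeof buf; i++)
            s += buf[i];
        return s;
    }

    /* Hostile: one byte per line per pass; by the time a pass returns to
     * a line it has long been evicted, so nearly every access misses. */
    static unsigned long sum_strided(void) {
        unsigned long s = 0;
        for (size_t off = 0; off < LINE; off++)
            for (size_t ln = 0; ln < NLINES; ln++)
                s += buf[ln * LINE + off];
        return s;
    }

    int main(void) {
        memset(buf, 1, sizeof buf);
        unsigned long a = sum_sequential(), b = sum_strided();
        printf("%lu %lu\n", a, b);   /* same result, very different miss rates */
        assert(a == b && a == (unsigned long)sizeof buf);
        return 0;
    }
    ```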

    Designing data structures to "be nice" to the cache is one of the most beneficial aspects of optimization - but worrying about individual cache line evictions is unlikely to be useful because of the pseudo-random nature of the typical cache replacement policy.