Hi,
In Cortex-A8's architecture, I'm trying to understand why the I-cache is chosen to be in VIPT form (Virtually Indexed Physically Tagged), while the D-cache is PIPT (Physically Indexed Physically Tagged). I know the advantages and disadvantages of using either VIPT/PIPT, but why not make both caches VIPT, or both PIPT?
Also, I'm trying to understand how VIPT can even work for certain A8 configurations under an OS like Linux, which uses 4 KB pages.
For example, the ARM VMSA says the L1 caches have:
- fixed line length of 64 bytes
- support for 16KB or 32KB caches (Let's Pick 32KB.)
- an instruction cache that is virtually indexed, IVIPT
- 4-way set associative cache
So from this, the number of cache lines would be 32 KB / 64 bytes = 512.
Size of a cache line = 64 bytes (the lower 6 bits of the address are the offset within the cache line).
As there are 4 ways, the number of sets (indexes) would be number of lines / number of ways = 512 / 4 = 128 (so the index is 7 bits).
The rest of the bits would go to the physical tag (32 - 6 - 7 = 19).
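As a quick sanity check of the arithmetic above, here is a minimal sketch in C. The struct and function names are made up for illustration (they are not from any ARM header), and it assumes 32-bit addresses and power-of-two sizes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type for illustration only. */
struct cache_geometry {
    unsigned lines;        /* total number of cache lines          */
    unsigned sets;         /* number of sets (distinct indexes)    */
    unsigned offset_bits;  /* address bits selecting a byte in line */
    unsigned index_bits;   /* address bits selecting a set         */
    unsigned tag_bits;     /* remaining bits of a 32-bit address   */
};

/* log2 of a power of two; asserts if x is not a power of two. */
static unsigned log2_exact(uint32_t x)
{
    assert(x != 0 && (x & (x - 1)) == 0);
    unsigned n = 0;
    while (x > 1) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Derive the geometry from cache size, line size (both in bytes) and ways. */
static struct cache_geometry compute_geometry(uint32_t size, uint32_t line,
                                              uint32_t ways)
{
    struct cache_geometry g;
    g.lines = size / line;
    g.sets = g.lines / ways;
    g.offset_bits = log2_exact(line);
    g.index_bits = log2_exact(g.sets);
    g.tag_bits = 32 - g.offset_bits - g.index_bits;
    return g;
}
```

Calling compute_geometry(32 * 1024, 64, 4) gives 512 lines, 128 sets, 6 offset bits, 7 index bits and 19 tag bits, matching the numbers above.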
For VIPT to work (that is, for the VA -> PA translation to happen in parallel with the cache index lookup), the bits comprising the index and the cache line offset should not change between the VA and the PA.
Now, if we take an OS like Linux, which uses 4 KB pages, only the lower 12 bits are the same in the VA and the PA, but the VIPT configuration described above requires the lower 13 bits (7 bits of index and 6 bits of cache line offset) to be fixed. So in this case, how would VIPT work for the instruction cache?
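To make the problem concrete, here is a small sketch of how the set index is extracted from an address. The function name and the example addresses are hypothetical; the only assumptions are 64-byte lines and an index that starts right above the line offset.

```c
#include <assert.h>
#include <stdint.h>

#define LINE_BITS 6u  /* 64-byte cache lines */

/* Extract the set index from an address, given the number of index bits. */
static uint32_t set_index(uint32_t addr, unsigned index_bits)
{
    assert(index_bits > 0 && index_bits < 26);
    return (addr >> LINE_BITS) & ((1u << index_bits) - 1u);
}
```

Take two addresses with the same 12-bit page offset (0x040) that differ only in bit 12, for example 0x00000040 and 0x00001040. With a 7-bit index (the 32 KB configuration) they select sets 1 and 65, so the index depends on a bit that translation can change. With a 6-bit index (the 16 KB configuration) both select set 1.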
thanks,
-Joel
joelagnel wrote: In Cortex-A8's architecture, I'm trying to understand why the I-cache is chosen to be in VIPT form (Virtually Indexed Physically Tagged), while the D-cache is PIPT (Physically Indexed Physically Tagged). I know the advantages and disadvantages of using either VIPT/PIPT, but why not make both caches VIPT, or both PIPT?
Actually, while ARM frequently describes the L1 data cache of the Cortex-A8 as PIPT, this is slightly misleading: it is in fact VIPT, but it has a built-in alias detection mechanism which evicts a cache line when it detects an access to the same physical line through a different virtual index, thus preventing incoherency and maintaining the illusion of PIPT (albeit with some unexpected cache evictions). This mechanism can also be disabled by setting bit 0 of the Auxiliary Control Register.
Note btw that the difference between virtual and physical indexing disappears when the cache size per way is no larger than the granularity of address translation (i.e. the page size, 4 KB). Since the L1 caches are 4-way, this means that if configured as 16 KB (4 KB per way) there is no difference between VIPT and PIPT. If the alternative configuration option of 32 KB is used (8 KB per way) then there are still only two potential virtual indices for any physical index, which is why alias detection is relatively easy. [Edit: sorry, I clicked reply before I had read your entire post which shows you already know this]
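The "per way versus page size" rule can be written down as a tiny helper. This is only a sketch of the arithmetic: the function name is made up, and it assumes all sizes are powers of two given in bytes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of distinct virtual indices ("page colors") that one physical
 * line can end up at in a virtually indexed cache.
 * A result of 1 means virtual and physical indexing are indistinguishable.
 */
static unsigned page_colors(uint32_t cache_size, uint32_t ways,
                            uint32_t page_size)
{
    assert(ways != 0 && page_size != 0 && cache_size % ways == 0);
    uint32_t way_size = cache_size / ways;
    return way_size <= page_size ? 1u : way_size / page_size;
}
```

For the two Cortex-A8 L1 options with 4 KB pages, page_colors(16 * 1024, 4, 4096) is 1 and page_colors(32 * 1024, 4, 4096) is 2.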
Addendum: a consequence is that although page coloring restrictions aren't necessary for coherency like they would be with a plain VIPT data cache, violating them can have a performance impact which wouldn't be there with a true PIPT cache. It should be easy to demonstrate with something like:
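    /* Map the same file page at two adjacent virtual pages (different bit 12). */
    volatile char *p = mmap( NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0 );
    volatile char *q = mmap( (void *)(p + 4096), 4096, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, fd, 0 );
    /* Alternate accesses to the same physical line through both mappings. */
    for( int i = 0; i < COUNT; i++ ) { ++*p; ++*q; }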
to cause the cache line to ping-pong between two page colors. (untested code since I don't have a suitable device at hand)