Armv7 ICIALLU vs ICIALLUIS

Hi experts! I have a question about the cache maintenance instructions.

The Armv7 Architecture Reference Manual (DDI0406C.b) says:

Effect of the Multiprocessing Extensions on All and set/way maintenance operations

The only architectural guarantee for the following instructions is that they apply to the caches or branch predictors of the processor that performs the operation:

• Invalidate entire instruction cache, ICIALLU
• Invalidate all branch predictors, BPIALL
• Clean and Invalidate data or unified cache line by set/way, DCCISW
• Clean data or unified cache line by set/way, DCCSW
• Invalidate data or unified cache line by set/way, DCISW.

That is, these operations have an effect only on the processor that performs the operation.

In the case of ICIALLU (Instruction Cache Invalidate All to PoU), as far as I know the processors in the same cluster share the L2 cache, and L2 is commonly the PoU.

So if ICIALLU takes effect as far as the PoU, I would guess that it also affects the other processors.

I don't understand why the manual says these "operations have an effect only on the processor that performs the operation."

Also, if snooping and the CCI keep the caches coherent, wouldn't the other processors be affected as well?

  • ICIALLUIS is a broadcast version of ICIALLU.

    Invalidating the instruction cache to PoU means that the caches will be invalidated to (that is, as far as) the point of unification.  The next fetches will therefore be served from the PoU, or beyond.

    Take Cortex-A15 as an example.  Each core has private L1 data and instruction caches, and all the cores share the unified L2 cache.  The PoU is the first point at which the different interfaces of a core (instruction fetch and data access, for example) see the same copy of an address.  Here that is the L2 cache, because at L1 there are separate I and D caches.  An invalidate to PoU means you invalidate as far as the L2 cache, but the L2 cache itself does not need to be invalidated.  Hence, the ICIALLU has no direct effect on the other cores.

    On the other hand, ICIALLUIS is broadcast to all the cores in the same inner shareable domain.  In our Cortex-A15 example, that means that all the cores invalidate their L1 instruction cache and re-fill from L2 (or the memory system).
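To make the difference concrete, here is a toy C model of one cluster. It is only a sketch, not a description of real hardware: the names (`struct cluster`, `model_iciallu`, `model_icialluis`), the four cores, and the eight-line caches are all invented for illustration, and the model treats "only guaranteed to affect the issuing core" as "affects only the issuing core". Neither operation touches the shared L2.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_CORES 4   /* assumption: one cluster, all cores in one Inner Shareable domain */
#define NUM_LINES 8   /* assumption: tiny caches, just to keep the model small */

struct cluster {
    bool l1i_valid[NUM_CORES][NUM_LINES]; /* private L1 instruction cache per core */
    bool l2_valid[NUM_LINES];             /* shared unified L2 (the PoU in this example) */
};

/* Mark every line valid so that the effect of an invalidate is visible. */
static void model_fill(struct cluster *c)
{
    for (int core = 0; core < NUM_CORES; core++)
        for (int line = 0; line < NUM_LINES; line++)
            c->l1i_valid[core][line] = true;
    for (int line = 0; line < NUM_LINES; line++)
        c->l2_valid[line] = true;
}

/* ICIALLU: invalidates the L1 I-cache of the issuing core only. */
static void model_iciallu(struct cluster *c, int core)
{
    assert(core >= 0 && core < NUM_CORES);
    for (int line = 0; line < NUM_LINES; line++)
        c->l1i_valid[core][line] = false;
}

/* ICIALLUIS: the same invalidate, broadcast to every core in the domain. */
static void model_icialluis(struct cluster *c)
{
    for (int core = 0; core < NUM_CORES; core++)
        model_iciallu(c, core);
}

/* Number of valid L1 I-cache lines on one core. */
static int model_l1i_count(const struct cluster *c, int core)
{
    assert(core >= 0 && core < NUM_CORES);
    int n = 0;
    for (int line = 0; line < NUM_LINES; line++)
        n += c->l1i_valid[core][line] ? 1 : 0;
    return n;
}

/* Number of valid lines in the shared L2. */
static int model_l2_count(const struct cluster *c)
{
    int n = 0;
    for (int line = 0; line < NUM_LINES; line++)
        n += c->l2_valid[line] ? 1 : 0;
    return n;
}
```

After `model_fill`, calling `model_iciallu(&c, 0)` empties core 0's L1 I-cache and leaves cores 1 to 3 and the L2 untouched; calling `model_icialluis(&c)` empties every core's L1 I-cache and still leaves the L2 alone.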

    You should note the general statements about caching in the ARM ARM.  An unlocked line is not guaranteed to stay in the cache; the processor can evict it speculatively.  It is therefore entirely possible that the other cores in a cluster have already evicted a given line from their caches, even though the ICIALLU was not broadcast.
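Since eviction on the other cores is possible but not something software can count on, code that needs every core to drop stale instructions would normally issue the Inner Shareable variant. Below is a minimal sketch of both operations as C wrappers with GCC-style inline assembly. The CP15 encodings are the architectural ones (ICIALLU is `c7, c5, 0`; ICIALLUIS is `c7, c1, 0`), but the wrapper names and the `ARMV7_PL1` macro are made up for this example. The macro is an assumed opt-in that you would define only when building privileged (PL1) code for an Armv7-A target; without it the wrappers compile to no-ops so the file still builds on any host.

```c
#include <assert.h>

/*
 * CP15 c7 cache maintenance: MCR p15, 0, <Rt>, c7, <CRm>, <opc2>
 * The value in <Rt> is ignored for these two operations.
 *   ICIALLU   : CRm = c5, opc2 = 0  (issuing core only)
 *   ICIALLUIS : CRm = c1, opc2 = 0  (broadcast, Inner Shareable)
 */
#if defined(__arm__) && defined(ARMV7_PL1)   /* ARMV7_PL1: hypothetical opt-in macro */
#define CP15_C7_WRITE(crm, opc2) \
    __asm__ volatile("mcr p15, 0, %0, c7, " #crm ", " #opc2 : : "r"(0) : "memory")
#define MAINTENANCE_BARRIERS() __asm__ volatile("dsb\n\tisb" : : : "memory")
#define CP15_ISSUED 1
#else
#define CP15_C7_WRITE(crm, opc2) ((void)0)   /* not a privileged Armv7 build: do nothing */
#define MAINTENANCE_BARRIERS()   ((void)0)
#define CP15_ISSUED 0
#endif

/* Invalidate the I-cache of the calling core only.
 * Returns 1 if the instruction was really issued, 0 in a no-op build. */
static int icache_invalidate_all_local(void)
{
    CP15_C7_WRITE(c5, 0);       /* ICIALLU */
    MAINTENANCE_BARRIERS();     /* DSB: wait for completion; ISB: refetch on this core */
    return CP15_ISSUED;
}

/* Invalidate the I-caches of all cores in the Inner Shareable domain.
 * Returns 1 if the instruction was really issued, 0 in a no-op build. */
static int icache_invalidate_all_inner_shareable(void)
{
    CP15_C7_WRITE(c1, 0);       /* ICIALLUIS */
    MAINTENANCE_BARRIERS();     /* the ISB only synchronizes the calling core */
    return CP15_ISSUED;
}
```

Note that the ISB here only resynchronizes the core that executes it; the other cores still need their own context synchronization before they are guaranteed to fetch the new instructions.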
