DIC/IDC bit in CTR

 Hello, 

In armv8 Spec, CTR_EL0.DIC/IDC is described as follows:

I really don't get the point of these two bits.

Can someone give me a scenario that explains what effect the two bits have?

Much appreciated.

  • The scenario is explained in the section "E2.2.5 Concurrent modification and execution of instructions" of the armv8 manual, under a comment "Coherency example for self-modifying code". The instructions that IDC and DIC target are respectively the DCache Clean and the ICache Invalidate seen in the code sequence associated with that example.

    These bits, when set to 1, allow the software to skip/remove instructions which otherwise would be necessary to establish coherency between the DCache and the ICache.

    These bits are how an armv8 implementation informs the software whether it needs to Clean the DCache and Invalidate the ICache when the CPU wants to execute the instruction stream which it wrote (under a normal, cache-enabled scenario).

    In the Cortex-A72 TRM, these bits are RES0, which means that, on the A72, the software must perform both the DCache Clean (unless the other conditions on LoC, or on LoUIS && LoUU, apply) and the ICache Invalidate to the PoU before the CPU can execute the instruction stream it wrote. Performing both the DCache Clean and the ICache Invalidate targets the weakest implementations; that code sequence should work without change even on implementations which provide stronger guarantees (i.e. set either or both of IDC/DIC to 1).

    In Cortex-A76, the IDC bit can be 1. If it is 1, the software does not need to run DCache Clean (but presumably must still run ICache Invalidate since A76 has DIC bit as RES0).

    On Linux, one can see how IDC bit being 1 can help reduce code:

    SYM_FUNC_START(dcache_clean_pou)
    alternative_if ARM64_HAS_CACHE_IDC
            dsb     ishst
            ret
    alternative_else_nop_endif
            dcache_by_line_op cvau, ish, x0, x1, x2, x3
            ret
    SYM_FUNC_END(dcache_clean_pou)

    With the IDC facility available, all that dcache_clean_pou needs to do is ensure that the writes are complete, and that any automatic DCache clean maintenance (presumed to be provided by the IDC facility) is also complete. Without the IDC facility, the OS must perform the DCache Clean to the PoU.

    Similarly, when the DIC facility is available (not shown above), the OS does not need to invalidate the ICache by address; the hardware handles that automatically in the background. The OS just needs to flush its pipeline and refetch the instructions by executing an ISB, typically after a DSB that ensures the writes have completed.
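
    To make the mapping from the two bits to the required maintenance concrete, here is a minimal sketch in C. It assumes IDC is bit 28 and DIC is bit 29 of CTR_EL0; the names `sync_steps`, `ctr_idc`, `ctr_dic` and `steps_for` are hypothetical, not taken from the manual or from Linux. The register value is passed in as a parameter so the code compiles on any host; on AArch64 it would be obtained with an MRS from CTR_EL0. The sketch ignores the CLIDR_EL1 conditions mentioned above, and the barriers (DSB, ISB) are still needed in every case.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions of the two fields in CTR_EL0. */
#define CTR_EL0_IDC_SHIFT 28
#define CTR_EL0_DIC_SHIFT 29

/* Hypothetical helper type: the by-address maintenance that software
 * still has to issue before executing code it has just written. */
struct sync_steps {
    bool dcache_clean_pou;  /* DC CVAU over the modified range */
    bool icache_inval_pou;  /* IC IVAU over the modified range */
};

static inline bool ctr_idc(uint64_t ctr)
{
    return (ctr >> CTR_EL0_IDC_SHIFT) & 1u;
}

static inline bool ctr_dic(uint64_t ctr)
{
    return (ctr >> CTR_EL0_DIC_SHIFT) & 1u;
}

/* IDC == 1 removes the DCache Clean, DIC == 1 removes the ICache
 * Invalidate. The CLIDR_EL1 exceptions for IDC == 0 are not modelled. */
static inline struct sync_steps steps_for(uint64_t ctr)
{
    struct sync_steps s;
    s.dcache_clean_pou = !ctr_idc(ctr);
    s.icache_inval_pou = !ctr_dic(ctr);
    return s;
}
```

    With both bits clear (the A72 case described above), `steps_for` reports that both operations are required; with only IDC set (possible on the A76), only the ICache Invalidate remains.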

  • I have read the linux code.

    Can I say that  DCache clean will not be needed for any cache coherency situation since the hardware will do it when IDC = 1?

  • Can I say that  DCache clean will not be needed for any cache coherency situation since the hardware will do it when IDC = 1?

    One cannot say that.

    The IDC/DIC facility is limited to establishing coherency between the DCache and the ICache. The facility does not dictate rules for other  situations dealing with coherency.

    For example, when cacheable data is to be read by a device through DMA, it is typically required to clean the data cache up to the PoC, so that the device reads the updated data and not the stale data. The IDC bit has no influence over this coherency situation (unless, for instance, the PoC is the same as the PoU).

  • Thank you for the detailed reply.

    So the key point is DCache and ICache coherency. It seems that this kind of coherency is not common.

    Apart from self-modifying code, is there any other DCache-ICache coherency situation we should pay attention to?

  • I would say moving code from disk to RAM is such a situation.

  • 42Bastian Schick is correct.

    The situation also arises in case of dynamic compilation (C#, or emulation/virtualization like QEMU), or in case of debugging (setting a breakpoint usually requires modifying the instruction stream).
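
    In user-space code such as a JIT or a loader, the maintenance is usually not written by hand. A minimal sketch, assuming a GCC- or Clang-compatible compiler: the `__builtin___clear_cache` builtin asks the toolchain to do whatever the target needs to make freshly written instructions visible to instruction fetch. The helper name `install_code` is hypothetical, and the sketch does not deal with mapping the destination as executable.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy instruction bytes into dst, then make them
 * visible to instruction fetch. The caller is assumed to have mapped
 * dst with suitable permissions before jumping to it. */
static void install_code(void *dst, const void *src, size_t len)
{
    /* The new instructions are written through the data side. */
    memcpy(dst, src, len);

    /* Compiler builtin (GCC/Clang): performs the target-specific
     * DCache/ICache synchronisation for the range [dst, dst + len). */
    __builtin___clear_cache((char *)dst, (char *)dst + len);
}
```

    Going through the builtin keeps the caller independent of whether a given implementation sets IDC or DIC; the decision about which operations can be skipped is left to the toolchain and the OS.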