DIC/IDC bit in CTR

Hello,

In the Armv8 spec, CTR_EL0.DIC/IDC is described as follows:

I don't really get the point of these two bits.

Can someone give me a scenario that explains what effect the two bits have?

Much appreciated.

  • The scenario is explained in the section "E2.2.5 Concurrent modification and execution of instructions" of the Armv8 manual, under the comment "Coherency example for self-modifying code". IDC and DIC concern, respectively, the DCache Clean and the ICache Invalidate in the code sequence given with that example.

    When set to 1, these bits allow software to skip instructions that would otherwise be necessary to establish coherency between the DCache and the ICache.

    They are how an Armv8 implementation tells software whether it needs to clean the DCache and invalidate the ICache before the CPU executes an instruction stream that it has just written (in the normal, cache-enabled case).
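
    As a minimal sketch of how software might test the two bits, the helpers below decode a raw CTR_EL0 value. The field positions (IDC at bit 28, DIC at bit 29) follow the architectural register layout; the helper and macro names are made up for this illustration, and the value is assumed to have been read elsewhere (for example with `mrs x0, ctr_el0`).

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* Bit positions of the two fields in CTR_EL0 (names are illustrative). */
    #define CTR_EL0_IDC_SHIFT 28
    #define CTR_EL0_DIC_SHIFT 29

    /* True if a DCache clean to the PoU is NOT required for
     * data-to-instruction coherence (IDC == 1). */
    static bool ctr_has_idc(uint64_t ctr)
    {
        return ((ctr >> CTR_EL0_IDC_SHIFT) & 1u) != 0;
    }

    /* True if an ICache invalidate to the PoU is NOT required for
     * data-to-instruction coherence (DIC == 1). */
    static bool ctr_has_dic(uint64_t ctr)
    {
        return ((ctr >> CTR_EL0_DIC_SHIFT) & 1u) != 0;
    }
    ```

    The helpers take the register value as a plain integer, so the decoding logic can be exercised on any host; only the read of the register itself is AArch64-specific.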

    In the Cortex-A72 TRM, these bits are RES0, which means that, on A72, software must perform both the DCache Clean (unless the other conditions about LoC or about LoUIS && LoUIU apply) and the ICache Invalidate to the PoU before the CPU can execute the instruction stream it wrote. Performing both the DCache Clean and the ICache Invalidate targets the weakest implementation; that code sequence should work unchanged even on implementations that provide stronger guarantees (i.e. that set either or both of IDC/DIC to 1).
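
    The conservative sequence described above can be sketched in C roughly as follows. This is an illustration under assumptions, not code from the manual or from any OS: the function name is made up, it assumes a GCC/Clang-style compiler, and it assumes the OS lets the current exception level read CTR_EL0 and issue `dc cvau` / `ic ivau`. On other architectures it simply defers to the compiler builtin.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    /* Make [start, start + len) coherent for instruction fetch, treating
     * IDC and DIC as 0 (hypothetical helper, weakest-implementation path). */
    static void sync_icache_conservative(void *start, size_t len)
    {
    #if defined(__aarch64__)
        uint64_t ctr;
        __asm__ volatile("mrs %0, ctr_el0" : "=r"(ctr));

        /* DminLine [19:16] and IminLine [3:0] hold log2 of the line size
         * in 4-byte words. */
        uintptr_t dline = (uintptr_t)4 << ((ctr >> 16) & 0xf);
        uintptr_t iline = (uintptr_t)4 << (ctr & 0xf);
        uintptr_t begin = (uintptr_t)start;
        uintptr_t end = begin + len;

        /* Clean each DCache line to the PoU, then wait for completion. */
        for (uintptr_t p = begin & ~(dline - 1); p < end; p += dline)
            __asm__ volatile("dc cvau, %0" : : "r"(p) : "memory");
        __asm__ volatile("dsb ish" : : : "memory");

        /* Invalidate each ICache line to the PoU, then wait for completion. */
        for (uintptr_t p = begin & ~(iline - 1); p < end; p += iline)
            __asm__ volatile("ic ivau, %0" : : "r"(p) : "memory");
        __asm__ volatile("dsb ish" : : : "memory");

        /* Discard anything already fetched into the pipeline. */
        __asm__ volatile("isb" : : : "memory");
    #else
        /* Not AArch64: let the compiler builtin do whatever the target needs. */
        __builtin___clear_cache((char *)start, (char *)start + len);
    #endif
    }
    ```

    Because it never consults IDC or DIC, this version stays correct on implementations that set either bit; it just does work they would not strictly need.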

    On Cortex-A76, the IDC bit can be 1. If it is 1, software does not need to run the DCache Clean (but presumably must still run the ICache Invalidate, since A76 has the DIC bit as RES0).

    On Linux, one can see how the IDC bit being 1 helps reduce code:

    SYM_FUNC_START(dcache_clean_pou)
    alternative_if ARM64_HAS_CACHE_IDC
            dsb     ishst
            ret
    alternative_else_nop_endif
            dcache_by_line_op cvau, ish, x0, x1, x2, x3
            ret
    SYM_FUNC_END(dcache_clean_pou)

    With the IDC facility available, all that dcache_clean_pou needs to do is ensure that the writes are complete, and that any automatic DCache clean maintenance (presumed to be provided by the IDC facility) is also complete. Without the IDC facility, the OS must perform the DCache Clean to the PoU.

    Similarly, when the DIC facility is available (not shown above), the OS just needs to flush its pipeline and refetch from the ICache by executing an ISB; the ICache invalidation has, for instance, already been handled automatically in the background by the hardware.
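
    To summarise the four IDC/DIC combinations, here is a small sketch that turns a CTR_EL0 value into the list of maintenance steps still required. The struct and function names are hypothetical; the barriers (DSB, then ISB) are assumed to remain necessary in every case and are therefore not modelled.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* Which cache maintenance steps software must still issue
     * (illustrative type; barriers are always required and not listed). */
    struct sync_plan {
        bool dc_cvau;   /* clean DCache lines to the PoU */
        bool ic_ivau;   /* invalidate ICache lines to the PoU */
    };

    static struct sync_plan plan_sync(uint64_t ctr)
    {
        struct sync_plan plan;

        plan.dc_cvau = ((ctr >> 28) & 1u) == 0;  /* needed when IDC == 0 */
        plan.ic_ivau = ((ctr >> 29) & 1u) == 0;  /* needed when DIC == 0 */
        return plan;
    }
    ```

    With both bits clear (the A72 case above) both steps are needed; with only IDC set (the A76 case) only the ICache invalidate remains; with both set, only the barriers are left.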
