I have been reading through the ARM documentation on memory and instruction barriers.
I have read that single-core ARMv7-M parts do not reorder instructions, and that DSB and ISB are therefore not needed. Is this correct?
I have also read the same about DMB; however, there is a concern across clock domains. For example, if a peripheral is running at a lower clock speed, then writing to the peripheral could take a long time. Imagine that you clear an interrupt flag in a peripheral during an ISR: if the clear does not take effect before exiting the ISR, the ISR could be falsely triggered again. Hence I was wondering, would DMB fix a problem like this across clock domains? And if so, does this only happen with the core data cache enabled?
That is, how can the core change the volatile flag register when the write takes longer, unless the peripheral has a local cache? And in that case, how does the core know the write to the peripheral's local register has been applied?
So in what cases, on a single-core part, would DSB be appropriate in code? And the same question for DMB and ISB?
Hi Trampas,
If the peripheral is running at 32KHz (e.g. an RTC), then it is likely that the bus interface is running on a different clock domain (i.e. the RTC is divided into two clock domains), and the registers would be shadowed. It is extremely unlikely that the access will take 1000+ cycles.
Please note that Cortex-M3/M4 has a write buffer. A write operation can take a number of clock cycles, but the subsequent instruction can start to execute before the write has completed. The write buffer in Cortex-M3/M4 is single-entry, and if there is a dummy data read, the read cannot be accepted until the write has completed (a consequence of the AHB protocol). Also, in the case of Cortex-M3/M4, exception entry/return cannot start until the buffered data write has completed. In Cortex-M3/M4, issuing a DSB ensures the write buffer is drained before the next instruction (which, for DSB, could be any instruction). A DMB can also be used if you just want to make sure the next data memory access doesn't start until the buffered write has completed.
In Cortex-M7, the write buffer has multiple entries, and some of the peripherals can be connected via AXI, which supports multiple outstanding transfers. Unlike Cortex-M3/M4, the write buffer doesn't have to be drained before exception entry/exit. You could use DSB to drain the write buffer, but a dummy read can be better - I will explain below. Otherwise, you can use DSB and DMB in the same way as on Cortex-M3/M4.
Most peripherals are connected using the AHB or APB bus protocols. These protocols don't allow multiple outstanding transfers or reordering between transfers (AXI allows both). So in most cases a dummy read ensures that the bus transfer has actually completed at the peripheral bus interface level.
*** The reason I suggested a dummy read rather than a DSB is that a DSB does not help when there are write buffers in system-level AHB-to-APB or AXI-to-AHB/APB bus bridges. (We have seen several support cases where an interrupt handler was executed twice due to delays in system-level write buffers.)
You are right that the register accessed by the dummy read needs to be declared volatile - otherwise the C compiler will optimize the access away. C compilers must not reorder volatile data accesses, so there is no need to put a memory barrier between peripheral accesses.
When you read back the flag register, whether you get the value in the fast clock domain or the value in the 32KHz clock domain depends entirely on the peripheral design. I don't think there is any rule on that. Usually MCU vendors provide example code for their peripherals, and I think they should provide more details in their documentation to explain their designs.
In the case that the interrupt line takes even longer to be de-asserted at the processor side (for example, a clock-domain-crossing interface for the interrupt signal might add a few clock cycles of delay to the de-assertion), the MCU vendor should provide a way for software developers to check whether the interrupt line is actually de-asserted (e.g. a status register that reflects the interrupt line).
>Of course I guess this is highly depended on how bad the vendor did their peripherals ...
LOL :-)
regards,
Joseph