Hi,
I am wondering if it makes sense to have a memory barrier after an access to a memory-mapped register. I am looking at a driver, unfortunately not open source, that has a memory barrier after a read from the peripheral's interrupt status register when processing the interrupt. I understand the use of a barrier when accessing main RAM, but does it make sense for registers?
CPU: A53 aarch64
Thanks
If the second access following the register read is to "Normal" memory, then it makes perfect sense.
The barrier prevents the processor from issuing the second access too early (e.g. the address location could be updated by an Interrupt Service Routine related to the first access).
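A minimal sketch of this case, assuming a hypothetical driver in which the status register lives in Device memory and a receive buffer lives in Normal memory. The names `struct dev`, `IRQ_RX_DONE` and `handle_irq` are made up for illustration, and the `dmb()` macro falls back to a GCC/Clang builtin fence so the sketch also builds on a non-Arm host:

```c
#include <stdint.h>

/* DMB on AArch64; on other targets a GCC/Clang builtin fence stands in
   so that the sketch still compiles and runs on a host machine. */
#if defined(__aarch64__)
#define dmb() __asm__ volatile("dmb sy" ::: "memory")
#else
#define dmb() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

#define IRQ_RX_DONE (1u << 0)   /* hypothetical status bit */

/* Hypothetical driver state: irq_status would point at a Device-memory
   register, rx_buf at a buffer in Normal memory filled by the device. */
struct dev {
    volatile uint32_t *irq_status;
    const uint8_t *rx_buf;
};

/* Returns the first received byte, or -1 if no receive interrupt is pending. */
int handle_irq(struct dev *d)
{
    uint32_t status = *d->irq_status;   /* first access: Device memory */

    if (!(status & IRQ_RX_DONE))
        return -1;

    /* Without a barrier the Normal-memory read below is not ordered
       against the Device read above and could be performed early. */
    dmb();

    return d->rx_buf[0];                /* second access: Normal memory */
}
```

Whether a full `dmb sy` is required, or a weaker variant is enough, depends on the system; the sketch only shows where the barrier sits.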
regards,
Joseph
Joseph, but if the memory-mapped device is in strongly ordered memory, then a memory barrier gives no benefit, right? Even speculative reads should not happen in strongly ordered memory. At least, that's what I thought.
Hi Bastian,
Strongly ordered (SO) memory accesses have ordering requirements against accesses to other SO or Device address locations, but not against other "Normal" memory accesses. See table 2 in the following link:
infocenter.arm.com/.../BIHJIIIC.html
This document is for Cortex-M, but the same memory ordering concept is present in Cortex-A and Cortex-R.
Update: I forgot to mention that in Armv7-M, there was a change in the memory ordering requirement.
In revision C of the Armv7-M Architecture Reference Manual, SO accesses do have ordering restrictions with Normal memory accesses.
In revision D of the Armv7-M Architecture Reference Manual, this is changed to align with other Arm processor designs.
Thanks Joseph, I missed the "strongly" vs. "normal" point.
Thank you Joseph. As I suspected, the barrier is not needed in this case. The code in the interrupt enable routine:
1. Enables the interrupts in the peripheral's interrupt enable register
2. DMB
3. Checks the interrupt status register in the peripheral
4. If an interrupt is pending, it processes the interrupt.
So, apparently the author suspected that the interrupt could happen between the enable and the check of the status register.
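The four steps above can be sketched in C as follows. This is only an illustration of the sequence as described, not the actual driver code: the register layout `struct periph_regs`, the function names and the `handled` counter are assumptions, and `dmb()` falls back to a GCC/Clang builtin fence on non-Arm hosts.

```c
#include <stdint.h>

#if defined(__aarch64__)
#define dmb() __asm__ volatile("dmb sy" ::: "memory")
#else
#define dmb() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

/* Hypothetical register block of the peripheral (Device memory). */
struct periph_regs {
    volatile uint32_t irq_enable;
    volatile uint32_t irq_status;
};

static unsigned handled;   /* demo only: number of processed interrupts */

/* Placeholder for the real interrupt processing. */
static void process_irq(uint32_t pending)
{
    (void)pending;
    handled++;
}

/* Returns the pending bits that were found right after enabling. */
uint32_t enable_irqs(struct periph_regs *r, uint32_t mask)
{
    r->irq_enable = mask;                       /* 1. enable interrupts   */
    dmb();                                      /* 2. DMB                 */
    uint32_t pending = r->irq_status & mask;    /* 3. check status        */
    if (pending)
        process_irq(pending);                   /* 4. process if pending  */
    return pending;
}
```

Steps 1 and 3 both target the same peripheral, which is why the DMB in step 2 is the part under discussion.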
Yes. Since the two accesses go to the same peripheral, there is no architectural requirement for a DMB there. However, it might be a peripheral-specific requirement (i.e. to introduce a few cycles of delay) to allow the peripheral status to be updated. A high-end processor might discard NOP instructions, but it will not discard a memory barrier instruction.
Hi Joseph,
I was happy with your answers; I finally had an idea of how memory ordering works on Arm, until yesterday I came across this article: https://www.embedded.com/design/programming-languages-and-tools/4437925/3/Dealing-with-memory-access-ordering-in-complex-embedded-designs-
Page 3 has an example similar to the question in this thread:
-------------------- Quote start
volatile uint32 control;  // write register to reset device
volatile uint32 status;   // read register to access status
uint32 x;

control = 1;              // reset device
// some code
x = status;               // read status
while ((x & 1) != 1) {
    x = status;
}
-------------------- Quote end
According to your answers, a barrier is needed after setting control and before the "some code", since the processor can perform the Normal memory accesses in "some code" before the Device/SO write to control. But since "status" and "control" are both in Device/SO memory, their order would not be changed (I hope I understood your answers correctly).
However, the article says a barrier is still needed to ensure that control is always written before status is read:
You might think all is well. At least the programmer has declared the memory-mapped peripheral registers using the volatile keyword. But presumably, the write to the control register should complete before we read the status register. Otherwise the device will not be reset properly before we access it. This code does not guarantee that. For reasons we have seen above, the compiler may promote the LDR from the status port above the STR to the control port because load latency is longer than store. Similarly, a multi-issue out-of-order execution unit in the processor might issue them in a different order. We need some way of ensuring that they happen in the order in which they are written.
If true, it means a barrier is still needed between Device memory accesses! Does that make sense to you?
Thanks for the help
Hi there,
If the two accesses are to the same device, architecturally the access order is guaranteed. So when handling a peripheral programming sequence, there is no need to insert a memory barrier between each access.
The situation is a bit more complex if the two accesses are to two different bus slaves. As seen in table A3-8 of the Armv7-M Architecture Reference Manual (link), if two accesses are both Device type but have different attributes (one shareable and one non-shareable), the two accesses are allowed to be performed out of order. In the example code from the page you mentioned, it is unclear whether the two registers belong to the same slave. Also, in the context of a peripheral reset, the software developer might need to take peripheral/device-specific behavior into account: if the reset takes multiple clock cycles, a memory barrier by itself might not be enough.
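As a rough sketch of both points, assuming the control and status registers might sit on different bus slaves and that the reset takes an unknown number of cycles: a barrier separates the two accesses, and a bounded poll covers the reset time. The names `reset_device`, `STATUS_READY` and `max_polls` are hypothetical, and `dmb()` falls back to a GCC/Clang builtin fence on non-Arm hosts.

```c
#include <stdint.h>

#if defined(__aarch64__)
#define dmb() __asm__ volatile("dmb sy" ::: "memory")
#else
#define dmb() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

#define STATUS_READY 1u   /* hypothetical "reset complete" bit */

/* control and status are passed separately because, in this sketch,
   they are assumed to possibly belong to different bus slaves.
   Returns 0 once the device reports ready, -1 if it never does. */
int reset_device(volatile uint32_t *control,
                 volatile uint32_t *status,
                 unsigned max_polls)
{
    *control = 1;   /* request the reset */

    /* Only needed if the two registers are not guaranteed to be
       ordered by the architecture (e.g. different slaves/attributes). */
    dmb();

    /* The barrier does not wait for the reset itself to finish,
       so poll the status a bounded number of times. */
    for (unsigned i = 0; i < max_polls; i++) {
        if (*status & STATUS_READY)
            return 0;
    }
    return -1;
}
```

A real driver would typically bound the poll by time rather than by iteration count; the count is used here only to keep the sketch self-contained.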
Thank you again, Joseph Yiu, for the clarification. I wish the article were more detailed.