Hi,
I am wondering if it makes sense to have a memory barrier after an access to a memory-mapped register. I am looking at a driver, unfortunately not open source, that has a memory barrier after a read from the peripheral's interrupt status register when processing the interrupt. I understand the use of a barrier when accessing main RAM, but does it make sense for registers?
CPU: A53 aarch64
Thanks
Hi Joseph,
I was happy with your answers, and I finally had an idea of how memory ordering works on Arm, until yesterday I came across this article: https://www.embedded.com/design/programming-languages-and-tools/4437925/3/Dealing-with-memory-access-ordering-in-complex-embedded-designs-
Page 3 has an example similar to the question in this thread:
-------------------- Quote start
volatile uint32 control;  // write register to reset device
volatile uint32 status;   // read register to access status
uint32 x;

control = 1;              // reset device
// some code
x = status;               // read status
while ((x & 1) != 1) {
    x = status;
}
-------------------- Quote end
According to your answers, a barrier is needed after setting control and before the "some code", since Arm can perform the Normal memory accesses in "some code" before the Device/SO memory write to control. But since "status" and "control" are both in Device/SO memory, their order would not be changed (I hope I understood your answers correctly).
However, the article says a barrier is still needed to ensure that control is always written before status is read:
You might think all is well. At least the programmer has declared the memory-mapped peripheral registers using the volatile keyword. But presumably, the write to the control register should complete before we read the status register. Otherwise the device will not be reset properly before we access it. This code does not guarantee that. For reasons we have seen above, the compiler may promote the LDR from the status port above the STR to the control port because load latency is longer than store. Similarly, a multi-issue out-of-order execution unit in the processor might issue them in a different order. We need some way of ensuring that they happen in the order in which they are written.
If true, it means a barrier is still needed between Device memory accesses! Does that make sense to you?
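For illustration, here is a minimal sketch of the case under discussion, showing where the barrier between the control write and the "some code" would sit. The register variables, the scratch buffer, and the names io_barrier and reset_and_read_status are all hypothetical; on real hardware control and status would be pointers into a Device memory mapping rather than ordinary variables. Whether a second barrier is also needed between the two Device accesses is exactly the open question, so the sketch does not add one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the device registers, so the sketch builds
   on a host. Real registers would live in a Device memory mapping. */
static volatile uint32_t control;
static volatile uint32_t status;

/* Normal memory touched by the "some code" part of the example. */
static uint32_t scratch[4];

/* Full barrier: DSB SY on AArch64, a compiler fence builtin elsewhere
   (assumption: GCC or Clang). */
static inline void io_barrier(void)
{
#if defined(__aarch64__)
    __asm__ volatile("dsb sy" ::: "memory");
#else
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

static uint32_t reset_and_read_status(void)
{
    assert((control & 1u) == 0u); /* sketch assumes reset is not already requested */

    control = 1;        /* Device write: reset the device */
    io_barrier();       /* keep the Normal accesses below behind the write */
    scratch[0] = 0xA5;  /* "some code" */
    return status;      /* Device read of the status register */
}
```

With the simulated registers, setting status to 1 beforehand makes reset_and_read_status() return 1 and leaves control set to 1.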
Thanks for the help
Hi there,
If the two accesses are to the same device, the access order is architecturally guaranteed. So when handling a peripheral programming sequence, there is no need to insert a memory barrier between each access.
The situation is a bit more complex if the two accesses are to two different bus slaves. As seen in Table A3-8 of the Armv7-M Architecture Reference Manual, if two accesses are both Device type but have different attributes (one shareable and one non-shareable), the two accesses are allowed to be out of order. In the example code from the page you mentioned, it is unclear whether the two registers belong to the same slave. Also, in the context of a peripheral reset, the software developer might need to take peripheral/device-specific behavior into account: if the reset takes multiple clock cycles, a memory barrier by itself might not be enough.
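As a rough sketch of the last point, a driver can poll the status register with a bounded loop after requesting the reset, instead of relying on a barrier alone. Everything here is an assumption for illustration: the STATUS_READY bit (taken from the bit tested in the quoted example), the function name reset_and_wait, and the poll limit. A real peripheral's datasheet defines the actual ready condition and timing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STATUS_READY 0x1u  /* assumed ready bit, as tested in the quoted example */

/* Request a reset, then poll the status register until the ready bit is
   set or the poll budget runs out. Returns true if the device came up. */
static bool reset_and_wait(volatile uint32_t *control,
                           volatile uint32_t *status,
                           unsigned max_polls)
{
    assert(max_polls > 0);

    *control = 1;  /* start the reset */

    for (unsigned i = 0; i < max_polls; i++) {
        if ((*status & STATUS_READY) != 0) {
            return true;   /* device reports ready */
        }
    }
    return false;          /* reset did not complete in time */
}
```

The bounded loop avoids hanging forever if the device never reports ready; the caller decides how to handle a false return.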
regards,
Joseph
Thank you again, Joseph Yiu, for the clarification. I wish the article were more detailed.