Barrier after access to memory mapped register?

Hi,

I am wondering if it makes sense to have a memory barrier after an access to a memory-mapped register. I am looking at a driver, unfortunately not open source, that has a memory barrier after a read from the peripheral's interrupt status register when processing the interrupt. I understand the use of a barrier when accessing main RAM, but does it make sense for registers?

CPU: A53 aarch64

Thanks

+1 Joseph Yiu over 5 years ago

If the second access following the register read is to "Normal" memory, then it makes perfect sense.

The barrier prevents the processor from issuing the second access too early (e.g. the address location could be updated by an Interrupt Service Routine related to the first access).
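To illustrate the point above, here is a minimal sketch of an interrupt handler that reads a status register and then reads a buffer in Normal memory. The register layout, the names `periph_regs_t`, `io_to_mem_barrier` and `handle_irq`, and the meaning of bit 0 are all assumptions made for illustration, not taken from any real driver. `DMB SY` is used as a conservative choice; a weaker barrier variant may be sufficient on a given system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block; on target this would be mapped as Device memory. */
typedef struct {
    volatile uint32_t irq_status;   /* bit 0: "data ready" (assumed meaning) */
} periph_regs_t;

/* Barrier between the Device read and the later Normal memory access. */
static inline void io_to_mem_barrier(void)
{
#if defined(__aarch64__)
    __asm__ volatile("dmb sy" ::: "memory");
#else
    /* Host stand-in so the sketch also builds off-target. */
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

/* Returns the first word of rx_buf if the peripheral flagged data, else 0. */
uint32_t handle_irq(periph_regs_t *regs, const uint32_t *rx_buf)
{
    assert(regs != NULL && rx_buf != NULL);

    uint32_t status = regs->irq_status;   /* first access: Device memory */
    if ((status & 1u) == 0u) {
        return 0u;                        /* nothing pending */
    }

    io_to_mem_barrier();                  /* keep the buffer read after the status read */
    return rx_buf[0];                     /* second access: Normal memory */
}
```

Without the barrier, the architecture would allow the read of `rx_buf[0]` to be performed before the status read, so the handler could see stale data.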

regards,

Joseph

0 42Bastian Schick over 5 years ago in reply to Joseph Yiu

Joseph, but if the memory-mapped device is in strongly ordered memory, then a memory barrier gives no benefit, right? Even speculative reads should not happen in strongly ordered memory. At least, that's what I thought.

0 Joseph Yiu over 5 years ago in reply to 42Bastian Schick

Hi Bastian,

Strongly ordered (SO) memory accesses have ordering requirements against accesses to other SO or Device address locations, but not against other "Normal" memory accesses. See table 2 in the following link:

infocenter.arm.com/.../BIHJIIIC.html

This document is for Cortex-M, but the same memory ordering concept is present in Cortex-A and Cortex-R.

regards,

Joseph

Update: I forgot to mention that in Armv7-M, there was a change in the memory ordering requirement.

In revision C of the Armv7-M Architecture Reference Manual, SO accesses do have ordering restrictions with Normal memory accesses.

In revision D of the Armv7-M Architecture Reference Manual, this is changed to align with other Arm processor designs.

0 42Bastian Schick over 5 years ago in reply to Joseph Yiu

Thanks Joseph, I missed the "strongly" vs. "normal" point.

0 dedoz over 5 years ago in reply to Joseph Yiu

Thank you Joseph. As I suspected, the barrier is not needed in this case. The code in the interrupt enable routine:

1. Enables the interrupts in the peripheral's interrupt enable register

2. Executes a DMB

3. Checks the interrupt status register in the peripheral

4. If an interrupt is pending, processes the interrupt

So, apparently the author suspected that an interrupt could occur between the enable and the check of the status register.
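The four steps above can be sketched roughly as follows. This is not the driver's actual code (which is not available); the register layout and the names `irq_regs_t`, `enable_barrier` and `enable_and_check` are hypothetical, and the function simply returns the pending bits so the caller can process them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block of a single peripheral (Device memory on target). */
typedef struct {
    volatile uint32_t irq_enable;
    volatile uint32_t irq_status;
} irq_regs_t;

static inline void enable_barrier(void)
{
#if defined(__aarch64__)
    __asm__ volatile("dmb sy" ::: "memory");
#else
    /* Host stand-in so the sketch also builds off-target. */
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

/* Enables the interrupts in mask and returns the enabled bits already pending. */
uint32_t enable_and_check(irq_regs_t *regs, uint32_t mask)
{
    assert(regs != NULL);

    regs->irq_enable = mask;         /* step 1: enable interrupts in the peripheral */
    enable_barrier();                /* step 2: the DMB found in the driver */
    uint32_t pending = regs->irq_status & mask;   /* step 3: check status */
    return pending;                  /* step 4: caller processes any pending bits */
}
```

Both accesses here target the same peripheral, which is the case the question is about.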

0 Joseph Yiu over 5 years ago in reply to dedoz

Yes. Since the two accesses go to the same peripheral, there is no architectural requirement for a DMB there. However, it might be a peripheral-specific requirement (i.e. to introduce a few cycles of timing delay) to allow the peripheral status to be updated. A high-end processor might potentially throw away NOP instructions, but it will not throw away a memory barrier instruction.

0 dedoz over 5 years ago in reply to Joseph Yiu

Hi Joseph,

I was happy with your answers; I finally had an idea of how memory ordering works on Arm, until yesterday I came across this article: https://www.embedded.com/design/programming-languages-and-tools/4437925/3/Dealing-with-memory-access-ordering-in-complex-embedded-designs-

Page 3 has an example similar to the question in this thread:

-------------------- Quote start

     volatile uint32 control; //write register to reset device
     volatile uint32 status; //read register to access status
     uint32 x;

     control = 1; // reset device

     // some code

     x = status; // read status
     while ((x & 1) != 1)
     {
          x = status;
     }

-------------------- Quote end

According to your answers, a barrier is needed after writing control and before the "some code", since the processor may perform the Normal memory accesses in "some code" before the Device/SO write to control. But since "status" and "control" are both in Device/SO memory, their order would not change (I hope I understood your answers correctly).

However, the article says a barrier is still needed to ensure that control is always written before status is read:

-------------------- Quote start

You might think all is well. At least the programmer has declared the memory-mapped peripheral registers using the volatile keyword. But presumably, the write to the control register should complete before we read the status register. Otherwise the device will not be reset properly before we access it. This code does not guarantee that.

For reasons we have seen above, the compiler may promote the LDR from the status port above the STR to the control port because load latency is longer than store. Similarly, a multi-issue out-of-order execution unit in the processor might issue them in a different order. We need some way of ensuring that they happen in the order in which they are written.

-------------------- Quote end

If true, it means a barrier is still needed between Device memory accesses! Does that make sense to you?

Thanks for the help

0 Joseph Yiu over 5 years ago in reply to dedoz

Hi there,

If the two accesses are to the same device, architecturally the access order is guaranteed. So when handling a peripheral programming sequence, there is no need to insert a memory barrier between each access.

The situation is a bit more complex if the two accesses are to two different bus slaves. As seen in table A3-8 of the Armv7-M Architecture Reference Manual, if two accesses are both Device type but have different attributes (one shareable and one non-shareable), the two accesses are allowed to be out of order. In the example code from the page you mentioned, it is unclear whether the two registers are from the same slave. Also, in the context of peripheral reset, the software developer might need to take into account peripheral/device-specific behavior: if the reset takes multiple clock cycles, use of the memory barrier by itself might not be enough.
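As a rough sketch of both points, the code below issues a reset, places a barrier between the control write and the status read, and then polls the status with a bounded number of attempts instead of relying on the barrier for timing. The names `reset_regs_t`, `reset_barrier` and `reset_and_wait`, the "bit 0 means ready" convention, and the poll limit are assumptions for illustration. Whether a DMB, a DSB, or no barrier at all is appropriate depends on how the two registers are mapped in the actual system.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical registers; they may or may not belong to the same bus slave. */
typedef struct {
    volatile uint32_t control;   /* write 1 to reset the device (assumed) */
    volatile uint32_t status;    /* bit 0 set when the device is ready (assumed) */
} reset_regs_t;

static inline void reset_barrier(void)
{
#if defined(__aarch64__)
    __asm__ volatile("dmb sy" ::: "memory");
#else
    /* Host stand-in so the sketch also builds off-target. */
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

/* Returns true if the device reported ready within max_polls status reads. */
bool reset_and_wait(reset_regs_t *regs, uint32_t max_polls)
{
    assert(regs != NULL);

    regs->control = 1u;          /* request the reset */
    reset_barrier();             /* matters only if the two accesses could be reordered */

    for (uint32_t i = 0; i < max_polls; ++i) {
        if ((regs->status & 1u) != 0u) {
            return true;         /* reset has completed */
        }
    }
    return false;                /* device did not become ready in time */
}
```

The polling loop is what covers a reset that takes multiple clock cycles; the barrier only constrains the order of the accesses.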

regards,

Joseph

0 dedoz over 5 years ago in reply to Joseph Yiu

Thank you again, Joseph Yiu, for the clarification. I wish the article were more detailed.
