
[ARMV8] dmb nshld vs dmb ishld -- practical differences?

Hello arm experts,

I am trying to understand when a load access of a memory location might produce side effects that other observers in the system may care about. So far, all the examples of dmb memory barriers that I can find in the ARMV8 reference material focus on the observability of *writes*, where the role of the shareability domains is fairly self-explanatory. What I have not been able to find is an example of when one might prefer dmb ishld over dmb nshld. Whether or not the memory address is in shareable memory, and whether or not it is visible to coherent caches, surely a read access cannot produce observable effects that would affect the correctness of the PE executing the dmb instruction?

If this is correct, then why does ARMV8 offer various domains instead of a single dmb ld with the least restrictive domain possible? And if this is not correct, then what would be a practical example where the difference between dmb nshld, dmb ishld, and dmb oshld would matter?

Thanks!

  • If I understand correctly, in the extended example above, we would expect the following:

    • P0's write to the flag must not be re-ordered relative to its first write of the message.

    This would be satisfied by a DMB ISH or DMB ISHST instruction on P0 between writing the message and writing the flag.

    • P1's read of the message must not be re-ordered relative to the read of the flag.

    This could be satisfied by a DMB NSHLD instruction on P1 between loading the flag and loading the message (i.e. we would not need to use DMB ISH or DMB ISHLD to ensure P0 observed P1 loading the flag.)

    • P1's write to the flag must not be re-ordered relative to its read of the message.

    This could be satisfied by a DMB NSHLD instruction on P1 between loading the message and writing the flag. Aside: If P1 were to produce some other state where observers expected to see the flag cleared before seeing state from P1, then P1 should use DMB ISHST after writing the flag, or, perhaps write the flag using STLR.

    • P0's second write to the message must not be re-ordered relative to its reads of the cleared flag.

    And this could be satisfied by a DMB NSHx instruction on P0 between loading the flag and writing the message (i.e. we would not need to use DMB ISHx to ensure P1 observed P0 loading the flag.) But you would probably use DMB ISHx here just to prevent the second message write from being observed before the first message write (depending on exactly how you wrote your flag polling loop on P0.)
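To make the four ordering requirements above concrete, here is a rough sketch of the mailbox handshake in portable C11. The names `mailbox_t`, `mailbox_send` and `mailbox_receive` are made up for illustration and do not come from the earlier example. It uses C11 fences rather than hand-picked DMB variants, so it only shows where each barrier sits. As far as I know, GCC and Clang lower an acquire fence to DMB ISHLD and a release fence to DMB ISH on AArch64, so this does not settle whether a non-shareable barrier would be enough at any of these points.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical one-slot mailbox shared by P0 (sender) and P1 (receiver). */
typedef struct {
    uint32_t    message; /* payload, accessed with plain loads/stores */
    atomic_uint flag;    /* 0 = slot empty, 1 = slot full */
} mailbox_t;

/* P0: publish a message. Returns false if P1 has not cleared the flag yet. */
static bool mailbox_send(mailbox_t *mb, uint32_t value)
{
    if (atomic_load_explicit(&mb->flag, memory_order_relaxed) != 0)
        return false;

    /* Requirement 4: the flag load is ordered before the message write. */
    atomic_thread_fence(memory_order_acquire);
    mb->message = value;

    /* Requirement 1: the message write is ordered before the flag write. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&mb->flag, 1, memory_order_relaxed);
    return true;
}

/* P1: consume a message. Returns false if the slot is empty. */
static bool mailbox_receive(mailbox_t *mb, uint32_t *out)
{
    if (atomic_load_explicit(&mb->flag, memory_order_relaxed) == 0)
        return false;

    /* Requirement 2: the flag load is ordered before the message load. */
    atomic_thread_fence(memory_order_acquire);
    *out = mb->message;

    /* Requirement 3: the message load is ordered before the flag clear. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&mb->flag, 0, memory_order_relaxed);
    return true;
}
```

P0 would call `mailbox_send` in its polling loop and P1 would call `mailbox_receive` in its own; each function returns false when the slot is not in the state it needs.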


    I don't know if that's what you meant by one PE observing another's reads.

    Not quite, I don't think. Some more context here might help to clarify. My team would like to ensure that a given PE does not re-order its own loads relative to each other -- you could say it is quite like P1 in the original mailbox example. We would like to use the least restrictive barrier possible for this, and based on the documentation and our own tests, DMB NSHLD appears to be sufficient. However, a question has been raised as to what effects of P1's loads, if any, are visible to other observers, and whether there are any practical cases where observers would need to see those effects (thus necessitating the use of DMB ISHLD or DMB OSHLD instead.)
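For reference, the load-to-load ordering in question looks roughly like the sketch below. The macro `LOAD_BARRIER`, the build switch `USE_NSH_LOAD_BARRIER` and the function `read_pair_ordered` are hypothetical names chosen for this post, not anything from our code base or from Arm. The shareability domain is kept as a compile-time choice precisely because which one is required is the open question here. On non-AArch64 targets it falls back to a C11 acquire fence so that the snippet still builds.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical build switch: define USE_NSH_LOAD_BARRIER to try DMB NSHLD. */
#if defined(__aarch64__) && defined(USE_NSH_LOAD_BARRIER)
#define LOAD_BARRIER() __asm__ volatile("dmb nshld" ::: "memory")
#elif defined(__aarch64__)
#define LOAD_BARRIER() __asm__ volatile("dmb ishld" ::: "memory")
#else
/* Fallback for other targets: a C11 acquire fence. */
#define LOAD_BARRIER() atomic_thread_fence(memory_order_acquire)
#endif

/*
 * Load *first, then *second, with a load barrier in between so that the
 * second load is not performed ahead of the first one.
 * Returns the first value and stores the second in *second_out.
 */
static uint32_t read_pair_ordered(const volatile uint32_t *first,
                                  const volatile uint32_t *second,
                                  uint32_t *second_out)
{
    uint32_t a = *first;   /* e.g. the flag */
    LOAD_BARRIER();        /* orders the load above before the load below */
    *second_out = *second; /* e.g. the message */
    return a;
}
```

Switching between the two DMB variants then only needs a rebuild, which is how one could compare them in a test; a passing test on one implementation would not, of course, show that the weaker barrier is architecturally guaranteed to be enough.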