Hello Arm experts,
I am trying to understand when a load access to a memory location might produce side effects that other observers in the system may care about. So far, all the examples I can find around dmb memory barriers in the Armv8 reference material focus on the observability of *writes*, whose importance and shareability domains are fairly self-explanatory. What I have not been able to find is an example of when one might prefer dmb ishld over dmb nshld, for example. Whether or not the memory address is in shareable memory, or visible to coherent caches, surely a read access cannot produce observable effects that would affect the correctness of the PE executing the dmb instruction?
If this is correct, then why does Armv8 offer various domains instead of a single dmb ld with the least restrictive domain possible? And if this is not correct, what would be a practical example where the difference between dmb nshld, dmb ishld, and dmb oshld would matter?
Thanks!
Thanks for the examples, Martin! I wanted to test whether DMB NSHx would at least order accesses on the same PE, so I tried:
AArch64 MP
"PodWW Rfe PodRR Fre"
Cycle=Rfe PodRR Fre PodWW
Generator=diycross7 (version 7.54+01(dev))
Prefetch=0:x=F,0:y=W,1:y=F,1:x=T
Com=Rf Fr
Orig=PodWW Rfe PodRR Fre
{
0:X1=x; 0:X3=y;
1:X1=y; 1:X3=x;
}
P0 | P1 ;
MOV W0,#1 | LDR W0,[X1] ;
STR W0,[X1] | DMB NSHLD ;
DMB ISHST | LDR W2,[X3] ;
MOV W2,#1 | ;
STR W2,[X3] | ;
exists
(1:X0=1 /\ 1:X2=0)
And the output was:
Test MP Allowed
States 4
1:X0=0; 1:X2=0;
1:X0=0; 1:X2=1;
1:X0=1; 1:X2=0;
1:X0=1; 1:X2=1;
Ok
Witnesses
Positive: 1 Negative: 3
Flag Assuming-common-inner-shareable-domain
Condition exists (1:X0=1 /\ 1:X2=0)
Observation MP Sometimes 1 3
Looking through armfences.cat and aarch64fences.cat, it looks like the DMB NSHx variants are not actually implemented in the simulator. Is that correct?
I am trying to determine if there's a practical case where PE-X could depend on having observed a certain order of loads by PE-Y, or if DMB NSHLD would be generally safe to use.
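To make the question concrete, here is a rough C11 sketch of the same MP shape. It is an illustration only, with several assumptions: run_mp_once is a hypothetical helper name, P1 spins until it sees y == 1 rather than loading y once as in the litmus test, and the release fence on P0 is at least as strong as the DMB ISHST above. As far as I know, GCC and Clang typically lower the acquire fence to DMB ISHLD on AArch64, and C11 has no way to ask for the NSH variant, so this shows the ordering that P1 relies on rather than the domain choice itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Shared locations, named after x and y in the litmus test. */
static atomic_int x, y;

/* P0: store to x, then to y, keeping the two stores ordered. */
static void *p0(void *arg) {
    (void)arg;
    atomic_store_explicit(&x, 1, memory_order_relaxed);
    /* Assumption: stands in for DMB ISHST (it is at least as strong). */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&y, 1, memory_order_relaxed);
    return NULL;
}

/* P1: wait until y == 1 is observed, then load x. */
static void *p1(void *arg) {
    int *x_seen = arg;
    while (atomic_load_explicit(&y, memory_order_relaxed) == 0) {
        /* spin until P0's store to y becomes visible */
    }
    /* Orders the load of y before the load of x. */
    atomic_thread_fence(memory_order_acquire);
    *x_seen = atomic_load_explicit(&x, memory_order_relaxed);
    return NULL;
}

/* Hypothetical helper: runs one round, returns the value of x seen by P1. */
int run_mp_once(void) {
    pthread_t t0, t1;
    int x_seen = -1;
    int rc;

    atomic_store(&x, 0);
    atomic_store(&y, 0);

    rc = pthread_create(&t1, NULL, p1, &x_seen);
    assert(rc == 0);
    rc = pthread_create(&t0, NULL, p0, NULL);
    assert(rc == 0);

    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return x_seen;
}
```

With both fences in place, the C11 model requires run_mp_once() to return 1, which corresponds to forbidding the 1:X0=1 /\ 1:X2=0 outcome. My question is whether the P1 side would still be safe if that fence only covered the non-shareable domain.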