static inline u64 atomic64_cmpxchg(atomic64_t *ptr, u64 old, u64 new)
{
	u64 oldval;
	unsigned long res;

	smp_mb();

	do {
		__asm__ __volatile__("@ atomic64_cmpxchg\n"
		"ldrexd		%1, %H1, [%3]\n"
		"mov		%0, #0\n"
		"teq		%1, %4\n"
		"teqeq		%H1, %H4\n"
		"strexdeq	%0, %5, %H5, [%3]"
		: "=&r" (res), "=&r" (oldval), "+Qo" (ptr->counter)
		: "r" (&ptr->counter), "r" (old), "r" (new)
		: "cc");
	} while (res);

	smp_mb();

	return oldval;
}
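For context, here is a hedged sketch of how a primitive like this is typically used: the caller reads the current value, computes a new one, and retries the cmpxchg until it lands. The helper name below (atomic64_add_if_nonzero) is made up for illustration and is not from the kernel.

static inline int atomic64_add_if_nonzero(atomic64_t *v, u64 a)
{
	u64 old, new;

	do {
		old = atomic64_read(v);		/* snapshot the current value      */
		if (old == 0)
			return 0;		/* condition no longer holds, bail */
		new = old + a;			/* compute the desired value       */
		/* retry if another CPU changed the counter under us */
	} while (atomic64_cmpxchg(v, old, new) != old);

	return 1;
}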
Now if I understand your question correctly, the scenario you have in mind is a little different. You're asking what would happen if the other thread were to change the value of the pointer itself (i.e. make it point to a different address). OK, the simple answer is that no number of barriers is going to fix this. In the first example the thing shared between the threads is the memory being pointed at, not the pointer itself. Hence we use synchronization controls for accessing the memory, but not the pointer. If the pointer itself is also a shared resource that can be modified, then access to it _also_ requires synchronization (e.g. a mutex).
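To make that concrete, here is a minimal sketch, assuming POSIX threads; the names (shared_ptr, ptr_lock, writer_retarget, reader_load) are made up for illustration. The point is simply that the pointer itself is only ever read or written under the mutex, in addition to whatever protects the pointed-to data.

#include <pthread.h>

static pthread_mutex_t ptr_lock = PTHREAD_MUTEX_INITIALIZER;
static int *shared_ptr;			/* the pointer itself is shared state */

void writer_retarget(int *new_target)
{
	pthread_mutex_lock(&ptr_lock);
	shared_ptr = new_target;	/* update of the pointer is guarded */
	pthread_mutex_unlock(&ptr_lock);
}

int reader_load(void)
{
	int val;

	pthread_mutex_lock(&ptr_lock);
	val = *shared_ptr;		/* both reading the pointer and dereferencing
					 * it happen under the lock */
	pthread_mutex_unlock(&ptr_lock);
	return val;
}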
The main issue seems to be a misunderstanding about how coherent caches work in an SMP system. The cache hardware means that these "just work": dirty lines for shared memory are migrated between cores as needed, so each core always sees the latest data. If one core explicitly invalidates a cache line while the data is still needed, then that is a software bug, but no explicit cache maintenance is needed for CPU-to-CPU coherency in SMP.
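As a small illustration (a sketch, assuming C11 <stdatomic.h> rather than kernel primitives): two threads can hand data across cores using nothing but ordinary cacheable loads and stores plus ordering. No cache flush or invalidate appears anywhere, because the coherency hardware moves the dirty line between cores for us.

#include <stdatomic.h>

static int payload;			/* plain, cacheable memory           */
static atomic_int ready;

void producer(void)			/* runs on one core                  */
{
	payload = 42;			/* ordinary store                    */
	atomic_store_explicit(&ready, 1,
			      memory_order_release);	/* ordering only, no cache ops */
}

int consumer(void)			/* runs on another core              */
{
	while (!atomic_load_explicit(&ready, memory_order_acquire))
		;			/* spin until the flag is observed   */
	return payload;			/* coherency guarantees we see 42    */
}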
I may very well be wrong
Acquire lock protecting structure A
Load or store some value from structure A
Release lock
Load or store some value from structure A
Acquire lock protecting structure A
Release lock
Load or store some value from structure A
dmb
Acquire lock protecting structure A
dmb
Release lock
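To show where those dmb instructions actually live, here is a rough sketch of an ARMv7 spinlock in the style of the kernel's (not its actual code; the names and asm are illustrative). The dmb issued right after taking the lock stops accesses to structure A from being observed before the acquire, and the dmb issued just before dropping the lock stops them from being observed after the release.

typedef struct { volatile unsigned int v; } raw_spinlock_t;

static inline void spin_lock(raw_spinlock_t *lock)
{
	unsigned int tmp;

	__asm__ __volatile__(
	"1:	ldrex	%0, [%1]\n"		/* read the lock word exclusively   */
	"	teq	%0, #0\n"		/* is it free?                      */
	"	strexeq	%0, %2, [%1]\n"		/* if so, try to claim it           */
	"	teqeq	%0, #0\n"		/* did the exclusive store succeed? */
	"	bne	1b"			/* held, or lost the race: retry    */
	: "=&r" (tmp)
	: "r" (&lock->v), "r" (1)
	: "cc", "memory");

	__asm__ __volatile__("dmb" ::: "memory");	/* accesses inside the critical
							 * section cannot drift above here */
}

static inline void spin_unlock(raw_spinlock_t *lock)
{
	__asm__ __volatile__("dmb" ::: "memory");	/* accesses inside the critical
							 * section complete before...      */
	lock->v = 0;					/* ...the lock is seen as free     */
}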