
Spin-lock implementation for AArch64 -- how to enforce acquire semantics?

Here is a minimal C implementation of a spinlock "lock" operation using GCC's built-in atomics:

#include <stdbool.h>

void spin_lock(bool *l) {
  while (__atomic_test_and_set(l, __ATOMIC_ACQUIRE))
    ;
}
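For context, the matching unlock I have in mind is the usual release-side clear (shown here only as my assumption of how the lock would be used, via GCC's __atomic_clear built-in):

#include <stdbool.h>

void spin_unlock(bool *l) {
  /* Clear the flag with release semantics so that accesses inside the
     critical section cannot be reordered after the unlock. */
  __atomic_clear(l, __ATOMIC_RELEASE);
}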

I am concerned by GCC's output when compiling for AArch64:

spin_lock:
    mov    w2, 1
    .p2align 2
.L4:
    ldaxrb    w1, [x0]
    stxrb    w3, w2, [x0]
    cbnz    w3, .L4
    uxtb    w1, w1
    cbnz    w1, .L4
    ret

The ldaxrb surely prevents subsequent memory accesses from being reordered before it, but, to my understanding, nothing prevents those accesses from being reordered between the ldaxrb and stxrb. If I understand correctly, the acquire barrier should be placed after stxrb, not before.

When compiling for 32-bit ARM, however, GCC correctly inserts a dmb after the strexb:

spin_lock:
    mov    r2, #1
.L4:
    ldrexb    r3, [r0]
    strexb    r1, r2, [r0]
    cmp    r1, #0
    bne    .L4
    tst    r3, #255
    dmb    sy
    bne    .L4
    bx    lr

Am I missing something? If GCC's output for AArch64 is correct, could anyone explain what enforces the acquire memory ordering I specified? If it is not, what would be a correct solution (besides what GCC emits for ARM)?
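For reference, one alternative I have been considering (just a sketch, not something I have verified against the architecture rules) is to perform the test-and-set with relaxed ordering and then issue an explicit acquire fence, which I would expect to force the barrier to be placed after the store-exclusive loop:

#include <stdbool.h>

void spin_lock_fence(bool *l) {
  /* Take the lock with no ordering requirement on the RMW itself... */
  while (__atomic_test_and_set(l, __ATOMIC_RELAXED))
    ;
  /* ...then prevent later accesses from moving above this point. */
  __atomic_thread_fence(__ATOMIC_ACQUIRE);
}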

I am using Linaro's gcc-linaro-5.3-2016.02-x86_64_aarch64-elf and gcc-linaro-4.9-2015.02-3-x86_64_arm-eabi toolchains.
