We have a section of code that increments a variable shared among several threads. The code section is protected by a ldxr/stxr/dmb spin lock, and there is another dmb after the shared variable is updated. This code sequence is the body of a function that returns the shared variable's value before the update. The expectation is that no two calls to the function will return the same value.
        .text
## lap for synchtest
## 8 threads call this 100 times each, concurrently.
## x1 -> quadword used as a spin lock
## x2 -> quadword supposed to source a series of distinct
##       values.
##
        .align 4
        .global synchtest
        .type synchtest, %function
synchtest:
        movz x20,6
        mov x12,sp
        str x30,[sp,8]
        sub sp,sp,144
        str x12,[sp,0]
        str x20,[sp,16]
## Enter critical section control
## (loop until [x0] goes 0->1)
L15182:
        movz x27,1
        add x12,x0,0
L15185:
        ldxr x11,[x12,0]
        cmp xzr,x11
        bne L15186
        stxr w10,x27,[x12,0]
        dmb 11
        cbnz w10,L15185
L15186:
        bne L15182
## This is the critical section
## increment shared global [x1]
        ldr x27,[x1,0]
        sub x23,x27,-8
        str x23,[x1,0]
## Now leave the critical section
        add x12,x0,0
        str xzr,[x12,0]
        dmb 11
## and return the original value from
## shared global [x1]
        str x27,[sp,136]
        ldr x0,[sp,136]
        movz x9,8
        ldr x30,[sp,152]
        ldr x20,[sp,160]
        str x24,[sp,32]
        add sp,sp,144
        ret
        .size synchtest,.-synchtest
We start 8 threads running concurrently, each calling the function 100 times, and record the results separately for each thread. Then we compare the sequences of values the threads received and check for duplicates, that is, cases where two threads got the same value. [EDIT: added:] Our expectation is that there will be no duplicates. We find this is not the case.
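For anyone wanting to reproduce the check, the duplicate test described above could be sketched in C roughly as follows. This is not the harness we actually used; `count_duplicates` and its flat, thread-by-thread result layout are assumptions made for illustration. It sorts a copy of all recorded values and counts adjacent repeats, so it also flags a repeat within one thread, which the function should never produce either.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort over uint64_t values. */
static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/*
 * Hypothetical helper: results holds nthreads * ncalls values, stored
 * thread by thread. Returns how many values repeat an earlier one
 * (0 means every call returned a distinct value), or -1 if the scratch
 * copy cannot be allocated.
 */
long count_duplicates(const uint64_t *results, size_t nthreads, size_t ncalls)
{
    size_t n = nthreads * ncalls;
    if (n == 0)
        return 0;

    uint64_t *sorted = malloc(n * sizeof *sorted);
    if (sorted == NULL)
        return -1;
    memcpy(sorted, results, n * sizeof *sorted);
    qsort(sorted, n, sizeof *sorted, cmp_u64);

    /* After sorting, equal values sit next to each other. */
    long dups = 0;
    for (size_t i = 1; i < n; i++)
        if (sorted[i] == sorted[i - 1])
            dups++;

    free(sorted);
    return dups;
}
```

With 8 threads and 100 calls each, a correct lock should give `count_duplicates(results, 8, 100) == 0`.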
Btw, we're on ARMv8 with CentOS 7.x (x = latest).
How did I miss that! Yes, of course the synchronization (the dmb) should have been applied before freeing the semaphore, not after it. Thanks.
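For comparison, here is the same ordering written with C11 atomics, as a rough sketch rather than a drop-in replacement for the assembly. All names (`synch_state`, `spin_acquire`, `spin_release`, `synchtest_sketch`) are made up for illustration. The point is that the release ordering is attached to the store that frees the lock, so the counter update becomes visible before the lock can be seen as free; on AArch64 a compiler would typically emit a store-release for it instead of a separate barrier placed after the store.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical names throughout; a C analogue, not the original code. */
typedef struct {
    atomic_uint_fast64_t lock;   /* 0 = free, 1 = held (role of [x0]) */
    uint64_t counter;            /* shared value (role of [x1]) */
} synch_state;

/* Spin until the lock goes 0 -> 1. Acquire ordering keeps the critical
 * section's accesses from moving ahead of taking the lock. */
static void spin_acquire(atomic_uint_fast64_t *lock)
{
    while (atomic_exchange_explicit(lock, 1, memory_order_acquire) != 0) {
        /* busy-wait until the previous holder releases */
    }
}

/* Release ordering: earlier writes (the counter update) are made
 * visible before other threads can observe the lock as free. */
static void spin_release(atomic_uint_fast64_t *lock)
{
    atomic_store_explicit(lock, 0, memory_order_release);
}

/* Returns the counter's value before adding 8, as the assembly does. */
uint64_t synchtest_sketch(synch_state *s)
{
    spin_acquire(&s->lock);
    uint64_t before = s->counter;
    s->counter = before + 8;
    spin_release(&s->lock);
    return before;
}
```

In the assembly the equivalent change is to issue the dmb before the `str xzr` that clears the lock word.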
I just tried googling 'ARM mutex' and it brought up
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dht0008a/ch01s03s02.html
which shows this for 32-bit ARM code, but there's no real difference.