Cortex-A9 accessing atomic variable results in dead loop

I'm using a Zynq-7000 and implementing a counter in a bare-metal system. A counter in shared memory is incremented by both cores. `_Atomic` is used so that the two cores stay synchronized, but accessing the atomic variable results in a dead loop.

The code

// shared_mem.h
#include <stdint.h>
#define SHARED_MEM_BASE_ADDR 0xffff0000
typedef struct {
	uint32_t basicCounter;
	_Atomic uint32_t atomicCounter;
} SharedMem;
#define SHAERD_MEM ((volatile SharedMem *)SHARED_MEM_BASE_ADDR)
 
// main.c, core0 and core1
#include "shared_mem.h"
volatile int debuggerflag = 0; // set to 1 by debugger, each core has one flag; volatile so the polling loop re-reads it
#define INC_VALUE 10000
void IncreaseCounters(){
	for(unsigned i=0; i< INC_VALUE; ++i){
		++SHAERD_MEM->basicCounter;
	}
	for(unsigned i=0; i< INC_VALUE; ++i){
		++SHAERD_MEM->atomicCounter;
		// blocks here if Xil_SetTlbAttributes(0xffff0000, 0x14de2); has been called
	}
}
int main(){
	for(;;){
		if(debuggerflag) {
			IncreaseCounters();
			debuggerflag = 0;
		}
	}
}

The problems I ran into:

  1. The counter value is only visible to one core. After running IncreaseCounters() on core0, core1 still sees the value 0 in shared memory.
  2. The default MMU config is "S=b0 TEX=b100 AP=b11, Domain=b0, C=b1, B=b1" as in translation_table.S. If `Xil_SetTlbAttributes(0xffff0000, 0x14de2)` from xil_mmu.c is called to change the config to "S=b1 TEX=b100 AP=b11, Domain=b1111, C=b0, B=b0" as in Xilinx xapp1079, then the counter value is visible to the other core. But the increment of the plain counter is not synchronized, and the increment of the atomic counter loops forever. The assembly of `++SHAERD_MEM->atomicCounter` is the following (in which strex always fails):

        dmb     ish				; data memory barrier
.L2:
        ldrex   r3, [r0]		; exclusive load, r3=*r0
        add     r3, r3, #1		; increment
        strex   r2, r3, [r0]	; exclusive store, *r0=r3, write result in r2
        cmp     r2, #0			; r2==0 means exclusive store succeeds
        bne     .L2				; retry if fails
        dmb     ish

Is this an MMU configuration problem? How can the counter be synchronized between the two cores? Thanks very much.

  • In general, cross-core atomics need the memory to be configured as "normal memory" and to be inner shareable. If you have cores in two clusters, the memory needs to be outer shareable, and the memory region used must support a global exclusive monitor.

  • Assuming you are not using TEX remap, my guess is that your system has no global monitor, which is why the exclusives always fail on uncached memory. Does either of these work (they mark the memory as cached, with and without write allocate respectively)?

    • S=b1 TEX=b101 AP=b11, Domain=b1111, C=b0, B=b1
    • S=b1 TEX=b111 AP=b11, Domain=b1111, C=b1, B=b1
  • The debugger shows cp15.c1.SCTLR is 0x08c5187d; the TRE bit is 0, so TEX remap is not enabled. Both attribute sets work (with the second argument of Xil_SetTlbAttributes being 0x15de6 or 0x17dee). Thanks so much!