Is it possible to have a normal store operation between LDX and STX operations from the same core/master?
Placing an explicit memory access between a Load-Exclusive/Store-Exclusive pair violates one of the conditions that a single thread's Load-Exclusive/Store-Exclusive loop must meet for the architecture to guarantee its forward progress. With such an access present, the architecture is no longer bound to uphold that guarantee, and software cannot rely on it while still expecting to run unmodified on every conforming implementation.
The section "Load-Exclusive and Store-Exclusive instruction usage restrictions" in the Armv8-A Architecture Reference Manual has the details.
The state machine diagrams for the local and global monitors, together with the accompanying statements, show how much of the monitor behaviour is left to the implementation.
For instance, when the local monitor of a PE is in the Exclusive Access state and the PE writes to an address (marked by the monitor or not) using any instruction other than a Store-Exclusive, the resulting state of the monitor is IMPLEMENTATION DEFINED. Quoting the manual:
When a PE writes using any instruction other than a Store-Exclusive instruction:
- If the write is to a PA that is not marked as Exclusive Access by its local monitor and that local monitor is in the Exclusive Access state, it is IMPLEMENTATION DEFINED whether the write affects the state of the local monitor.
- If the write is to a PA that is marked as Exclusive Access by its local monitor, it is IMPLEMENTATION DEFINED whether the write affects the state of the local monitor.
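To make the consequence of the quoted rules concrete, here is a toy model of a local monitor in C. It is only a sketch, not a description of any real core: the `mon_policy` knob is a hypothetical stand-in for the IMPLEMENTATION DEFINED choice, all names are made up for illustration, and the model simplifies other cases (for example, it treats a Store-Exclusive to an unmarked address as a plain failure).

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a PE's local exclusive monitor (illustrative only). */
typedef enum { MON_OPEN, MON_EXCLUSIVE } mon_state;

/* Hypothetical knob standing in for the IMPLEMENTATION DEFINED choice. */
typedef enum {
    PLAIN_STORE_KEEPS_MONITOR,
    PLAIN_STORE_CLEARS_MONITOR
} mon_policy;

typedef struct {
    mon_state  state;
    uintptr_t  marked;   /* address marked by the last Load-Exclusive */
    mon_policy policy;
} local_monitor;

/* Load-Exclusive: mark the address and enter the Exclusive Access state. */
static void load_excl(local_monitor *m, uintptr_t addr) {
    m->state = MON_EXCLUSIVE;
    m->marked = addr;
}

/* Plain store: its effect on the monitor depends on the modelled policy.
   The address is ignored because the manual leaves both the marked and
   the unmarked case IMPLEMENTATION DEFINED. */
static void plain_store(local_monitor *m, uintptr_t addr) {
    (void)addr;
    if (m->state == MON_EXCLUSIVE && m->policy == PLAIN_STORE_CLEARS_MONITOR)
        m->state = MON_OPEN;
}

/* Store-Exclusive: succeeds only if the monitor still marks addr.
   Simplification: the monitor always returns to the open state. */
static bool store_excl(local_monitor *m, uintptr_t addr) {
    bool ok = (m->state == MON_EXCLUSIVE && m->marked == addr);
    m->state = MON_OPEN;
    return ok;
}

/* LoadExcl(a); plain store to b; StoreExcl(a). Returns StoreExcl's result. */
static bool ldx_str_stx(mon_policy p, uintptr_t a, uintptr_t b) {
    local_monitor m = { MON_OPEN, 0, p };
    load_excl(&m, a);
    plain_store(&m, b);
    return store_excl(&m, a);
}
```

Under this model the same instruction sequence succeeds on one "implementation" and fails on another. If the failing case sits inside a retry loop, the loop never makes progress, which is exactly the situation the usage restrictions are there to rule out.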
Keeping such writes out of the Load-Exclusive/Store-Exclusive sequence makes the software independent of how an implementation behaves in this particular situation (and possibly in many other situations that arise from such writes).
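One way to stay within the restrictions from C is to let the compiler generate the exclusive pair and keep any ordinary store outside the retry loop. The sketch below uses C11 atomics. It assumes a toolchain that lowers the compare-exchange to an exclusive load/store loop on AArch64 when the LSE atomics are not in use (with LSE it may emit a single CAS instruction instead). `counter`, `last_writer` and `add_and_record` are hypothetical names used only for illustration.

```c
#include <stdatomic.h>

/* Hypothetical shared counter and an unrelated plain variable. */
static _Atomic int counter;
static int last_writer;

/* Atomically add delta to counter and return the previous value. */
static int add_and_record(int delta, int writer_id) {
    /* The plain store is done before the retry loop, not inside the
       exclusive sequence. */
    last_writer = writer_id;

    int old = atomic_load_explicit(&counter, memory_order_relaxed);

    /* The compare-exchange contains the exclusive load/store pair; no
       other memory access from this source is placed inside it. */
    while (!atomic_compare_exchange_weak_explicit(
               &counter, &old, old + delta,
               memory_order_acq_rel, memory_order_relaxed)) {
        /* On failure, old now holds the current value; just retry. */
    }
    return old;
}
```

If the loop is hand-written in assembly, the same idea applies: compute on registers between the Load-Exclusive and the Store-Exclusive, and move any normal store before the load or after the store.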