Locks, SWPs and two Smoking Barriers

David Rusling
September 11, 2013

Before ARMv6, the main synchronisation mechanism was the SWP instruction. SWP serves two purposes. In a uniprocessor system, it guarantees that the read and the write cannot be separated by an interrupt. In a multiprocessor system, it ensures that no other master can access the location between the read and the write. On multiprocessor systems with complex memory hierarchies and long memory latencies, however, SWP creates performance bottlenecks.

ARMv6 replaced SWP with exclusive loads and stores (LDREX and STREX). These work on the principle of a monitor that exists for a location in memory and effectively tags that location with the identity of the agent(s) trying to access it. In a spinlock implementation, an exclusive load reads data from memory and tags the location with the agent's identifier. A few instructions later, the agent uses an exclusive store to write data back to memory. The store succeeds only if the tag is still valid, and the tag is valid only if no other agent has modified that location since the exclusive load.
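To make the tagging idea concrete, here is a toy C model of a monitor guarding a single memory word. It is only a sketch: the names (monitored_word, load_exclusive, store_exclusive, plain_store, try_lock_once) are invented for illustration, and a real monitor is hardware state with implementation-defined details that this model ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy software model of an exclusive monitor for one memory word.
 * Every name here is hypothetical; this is not how hardware is built. */
typedef struct {
    uint32_t value;  /* the memory location itself */
    uint32_t tags;   /* bit N set: agent N holds a valid tag */
} monitored_word;

/* Models LDREX: read the value and tag the location for this agent. */
uint32_t load_exclusive(monitored_word *w, unsigned agent)
{
    assert(agent < 32);            /* one tag bit per agent in this model */
    w->tags |= 1u << agent;
    return w->value;
}

/* Models STREX: returns 0 on success and 1 on failure. */
int store_exclusive(monitored_word *w, unsigned agent, uint32_t v)
{
    assert(agent < 32);
    if (!(w->tags & (1u << agent)))
        return 1;                  /* tag lost: the location was modified */
    w->value = v;
    w->tags = 0;                   /* a successful store invalidates all tags */
    return 0;
}

/* Models an ordinary store: it modifies the location, so tags are dropped.
 * (Simplification: the model drops every agent's tag, including the writer's.) */
void plain_store(monitored_word *w, uint32_t v)
{
    w->value = v;
    w->tags = 0;
}

/* One attempt at taking a lock where 0 means free and 1 means held. */
bool try_lock_once(monitored_word *lock, unsigned agent)
{
    if (load_exclusive(lock, agent) != 0)
        return false;              /* lock is already held */
    return store_exclusive(lock, agent, 1) == 0;
}
```

If two agents both perform the exclusive load and see the lock free, only the first exclusive store succeeds; the second finds its tag gone and has to retry, which is the behaviour a spinlock relies on.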

At the same time as the load and store exclusives were added to the ARM architecture, the SWP instruction was deprecated, and the architecture notes that SWP is not guaranteed to work on SMP systems. The load and store exclusives and the deprecation of SWP are described in detail in the ARM ARM [1].

As an aid to removing legacy SWP instructions, ARMv7 allows you to disable the SWP instruction. In ARM SMP Linux, to help find legacy uses of SWP, we disable the instruction but emulate it (via the undefined instruction trap) and log each emulation. While this adds instruction overhead, it ensures that the software operates safely. You should also be aware that the SWP instruction does not exist in the Thumb-2 instruction set, so you will see errors if you try to assemble code containing SWP for Thumb-2.

For SMP performance, we want to replace uses of the SWP instruction in libraries and applications with appropriate load and store exclusive instructions. The easiest way to achieve this is to use the GCC compiler built-ins (the __sync family of atomic built-ins, described in the GCC documentation). ARM GCC will either directly generate the correct inline code or insert a call to a kernel user helper function containing the right code.

As an example, consider the following assembly code function implementing a spin lock:

CODE
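ENTRY (__spin_lock)
    mov r1,#1           @ value that marks the lock as held
1:  swp r2,r1,[r0]      @ atomically swap it with the lock word
    teq r2,#0           @ was the lock previously free?
    bne 1b              @ no: spin and try again
    mov r0,#0           @ yes: the return value is 0
    bx lr               @ return to the caller
END (__spin_lock)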

This would be replaced with:

CODE
typedef struct {int flag; ...} spinlock_t;

int __spin_lock(spinlock_t *lock)
{
  // spin until the previous value of the flag was 0 (lock was free)
  while (__sync_lock_test_and_set(&lock->flag, 1));
  return 0;
}

To release the lock, you need to call another of the GCC builtins, in this case __sync_lock_release():

CODE
void __spin_unlock(spinlock_t *lock) {
  // write 0 to the flag, marking the lock as free
  __sync_lock_release(&lock->flag);
}

Code wishing to lock a data structure would look something like this:

CODE
// grab the lock
__spin_lock(&lock);
// modify the locked data structure
<modification code>
// release the lock
__spin_unlock(&lock);
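As a slightly fuller sketch of the same pattern, the lock can live inside the data structure it protects. Everything below is hypothetical and for illustration only: the stats_t structure, its fields and the demo_ function names are not part of any library, and the lock functions simply repeat the built-in calls shown above under different names.

```c
#include <assert.h>

/* Hypothetical names throughout; the lock mirrors the spinlock_t above. */
typedef struct { int flag; } spinlock_t;

void demo_spin_lock(spinlock_t *lock)
{
    while (__sync_lock_test_and_set(&lock->flag, 1))
        ;  /* spin until the previous value of the flag was 0 */
}

void demo_spin_unlock(spinlock_t *lock)
{
    __sync_lock_release(&lock->flag);
}

/* Two fields that must always change together, plus the lock guarding them. */
typedef struct {
    spinlock_t lock;
    long total;
    long count;
} stats_t;

void stats_add(stats_t *s, long sample)
{
    demo_spin_lock(&s->lock);      /* grab the lock */
    s->total += sample;            /* modify the locked data structure */
    s->count += 1;
    demo_spin_unlock(&s->lock);    /* release the lock */
}

long stats_mean(stats_t *s)
{
    demo_spin_lock(&s->lock);
    long total = s->total;         /* read both fields as one consistent pair */
    long count = s->count;
    demo_spin_unlock(&s->lock);
    assert(count > 0);             /* the mean of no samples is undefined */
    return total / count;
}
```

Because every access to total and count happens between the lock and unlock calls, a thread calling stats_mean() should never observe a total that has been updated without its matching count.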



The built-in functions take care of all of the details for you, including dealing with weakly ordered memory systems via memory barriers. This is only a brief introduction; for more details, I suggest that you read the "Barrier Litmus Tests and Cookbook" document [2] in ARM's Infocenter.
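The barrier behaviour hidden inside the built-ins can be made visible with GCC's newer __atomic built-ins, which take an explicit memory order. The sketch below assumes GCC 4.7 or later (or a compatible compiler), and the function names are hypothetical. GCC documents __sync_lock_test_and_set() as an acquire barrier and __sync_lock_release() as a release barrier, so this version is intended to behave the same way as the functions above.

```c
#include <assert.h>

/* Hypothetical names; same layout as the spinlock_t used above. */
typedef struct { int flag; } spinlock_t;

/* Returns 1 if the lock was taken, 0 if it was already held. */
int lock_try(spinlock_t *l)
{
    /* Acquire ordering: accesses after this call cannot move before it. */
    return __atomic_exchange_n(&l->flag, 1, __ATOMIC_ACQUIRE) == 0;
}

void lock_acquire(spinlock_t *l)
{
    while (!lock_try(l))
        ;  /* spin until the exchange returns the "free" value */
}

void lock_release(spinlock_t *l)
{
    /* Debug check: releasing a lock that is not held is a bug. */
    assert(__atomic_load_n(&l->flag, __ATOMIC_RELAXED) != 0);
    /* Release ordering: accesses before this call cannot move after it. */
    __atomic_store_n(&l->flag, 0, __ATOMIC_RELEASE);
}
```

The acquire and release orderings keep the accesses inside the critical section from being observed outside it, which is the part that matters on a weakly ordered memory system.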

In the next article, I will explain how to implement spin locks in assembler and describe how memory barriers should be used.

References:

  1. ARM DDI 0406B_errata_2009_Q3 (ID100209) : ARM® Architecture Reference Manual ARM®v7-A and ARM®v7-R edition
  2. PRD03-GENC-007826 1.0 : Barrier Litmus Tests and Cookbook


David Rusling, ARM Fellow, David was born a few weeks before Sputnik was launched. He's always liked mathematics, but America's space program together with 'Star Trek' made him think that computers were really interesting and so he graduated in 1982 with a degree in Computer Science. The future turns out to have less flashing lights than he expected. After hacking networking boxes for Digital Equipment Corporation, he got involved in the port of Linux to the Alpha processor. This gave him an abiding respect for the power of open source in general and Linux in particular. He worked on StrongARM before moving to ARM where he added tools experience. He's an ARM Fellow; which he says, "really means that I'm a techno-dweeb with a wide freedom to meddle." His official role is to set the technical direction for ARM's tools and software story.
