The ARMv8-A architecture and its ongoing development

ARMv8-A, the ARMv8 A-profile version of the ARM architecture, was first publicly previewed in October 2011. Over the past two years, there has been a growing number of ARMv8-A announcements from ARM, such as its Cortex-A53 and Cortex-A57 products, plus additional cores and end-user devices from licensees and OEMs. Many of these products are in, or entering, volume production today. As reported in the Q3-2014 financial results, ARM has signed 57 ARMv8-A processor and architecture licenses, meaning there are many more ARMv8-A based processors and products under development that will appear over the next one to two years.

Architecture evolves with constant requests for additions and refinements. To allow the ARM ecosystem to manage the next stage of its evolution, ARM is introducing a set of small-scale enhancements that are fully backwards compatible with the initial v8.0 architecture, and will be collectively known as ARMv8.1-A. These have been developed in conjunction with the ARM partnership and will start to appear in public specifications, software development tools, models and software support throughout 2015, with early adopter silicon expected in the latter part of 2015. More details will emerge from ARM and its partners as products are introduced. It is important to recognize that introduction of these enhancements into new cores will take several years, and other design choices can have a much greater impact on system performance. Some markets and use cases, such as mobile, are expected to see little benefit from these changes. This means that v8.0 will continue to be the architecture of choice for many new designs and most software development over the medium term, and that v8.1 will have a gradual effect across different market segments, starting with very large systems. Many of the changes will be transparent to the user, with operating systems such as Linux using runtime library selection or kernel patches to adapt where necessary.
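To illustrate what runtime library selection could look like, the following is a minimal sketch in C. It assumes an AArch64 Linux system whose kernel headers expose a HWCAP_ATOMICS capability bit through getauxval(AT_HWCAP); on any other platform the sketch simply reports the v8.0 path. The enum and function names are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdbool.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_ATOMICS, where the kernel headers provide it */
#endif

/* Hypothetical names for the two code paths a library might ship. */
enum atomics_backend {
    BACKEND_V80_EXCLUSIVES,  /* Load-exclusive/Store-exclusive loops */
    BACKEND_V81_ATOMICS      /* ARMv8.1 atomic instructions */
};

/* Ask the operating system whether the v8.1 atomics are usable.
 * Assumption: the capability is reported through the AT_HWCAP vector. */
static bool cpu_has_v81_atomics(void)
{
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP_ATOMICS)
    return (getauxval(AT_HWCAP) & HWCAP_ATOMICS) != 0;
#else
    return false;  /* unknown platform: stay on the v8.0 path */
#endif
}

/* Pick an implementation once, typically at library start-up. */
static enum atomics_backend select_backend(bool has_atomics)
{
    return has_atomics ? BACKEND_V81_ATOMICS : BACKEND_V80_EXCLUSIVES;
}

/* Example of use: the selection always yields one of the two paths. */
static void runtime_selection_demo(void)
{
    enum atomics_backend chosen = select_backend(cpu_has_v81_atomics());
    assert(chosen == BACKEND_V80_EXCLUSIVES || chosen == BACKEND_V81_ATOMICS);
}
```

Falling back to the v8.0 path whenever the capability cannot be confirmed keeps the same binary usable on both v8.0 and v8.1 processors.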

For a summary of the ARMv8-A architecture, see the section on ARMv8 architectural concepts in Chapter A1 of the ARMv8-A Architecture Reference Manual. This document, ARM DDI 0487, can be downloaded by following the links from the top level => ARM architecture => reference manuals section.

ARMv8.1 overview

The enhancements introduced with ARMv8.1 fall into two categories:

  • Changes to the instruction set.
  • Changes to the exception model and memory translation.

Instruction set enhancements

ARMv8.1 includes the following instruction set additions, most of which apply to the A64 instruction set:

  • A set of AArch64 atomic read-write instructions
  • Additions to the Advanced SIMD instruction set for both AArch32 and AArch64 to enable opportunities for some library optimizations:
    • Signed Saturating Rounding Doubling Multiply Accumulate, Returning High Half
    • Signed Saturating Rounding Doubling Multiply Subtract, Returning High Half
    • The instructions are added in vector and scalar forms.
  • A set of AArch64 load and store instructions that can provide memory access order that is limited to configurable address regions.
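As an aid to reading the two Advanced SIMD instruction names above, the following is a minimal scalar sketch in C of the arithmetic for a single 16-bit element. It assumes the operation is: double the product, add it to (or subtract it from) the accumulator aligned with the high half, add a rounding constant, keep the high half, and saturate. The function names are hypothetical, and this is a reference model for illustration, not an implementation of the instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Saturate a wide intermediate result to the signed 16-bit range. */
static int16_t sat16(int64_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Divide by 2^16 rounding toward negative infinity, without relying on
 * the implementation-defined result of right-shifting negative values. */
static int64_t floor_div_65536(int64_t v)
{
    int64_t q = v / 65536;
    if (v % 65536 < 0)
        q -= 1;
    return q;
}

/* One 16-bit lane: accumulate (subtract == 0) or subtract (subtract != 0)
 * the rounded high half of the doubled product, then saturate. */
static int16_t sqrdml_model16(int16_t acc, int16_t a, int16_t b, int subtract)
{
    int64_t product = 2 * (int64_t)a * (int64_t)b;  /* doubling multiply */
    int64_t sum = (int64_t)acc * 65536              /* accumulator aligned with the high half */
                + (subtract ? -product : product)
                + 32768;                            /* rounding constant, 1 << 15 */
    return sat16(floor_div_65536(sum));
}

/* Worked values in Q15 fixed point, where 16384 represents 0.5. */
static void sqrdml_model16_examples(void)
{
    assert(sqrdml_model16(0, 16384, 16384, 0) == 8192);   /* 0 + 0.5 * 0.5 = 0.25 */
    assert(sqrdml_model16(0, 16384, 16384, 1) == -8192);  /* 0 - 0.5 * 0.5 = -0.25 */
    /* (-1.0) * (-1.0) cannot be represented in Q15, so the result saturates. */
    assert(sqrdml_model16(0, INT16_MIN, INT16_MIN, 0) == INT16_MAX);
}
```

Fixed-point filter and transform kernels commonly use this multiply-accumulate pattern, which is the kind of library code these additions are intended to help.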

As well as the additions, the optional CRC instructions in v8.0 become a requirement in ARMv8.1.
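For readers unfamiliar with what the CRC instructions accelerate, the following is a minimal portable sketch in C of a bitwise CRC-32 calculation. It assumes the two polynomials involved are the widely used CRC-32 (0x04C11DB7) and CRC-32C (0x1EDC6F41) polynomials, written here in bit-reflected form, and that the per-byte update step corresponds to the work a byte-wide CRC instruction performs. The initial and final inversions shown are the usual software convention and are assumed to remain the responsibility of software. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRC32_POLY_REFLECTED  0xEDB88320u  /* bit-reflected 0x04C11DB7 */
#define CRC32C_POLY_REFLECTED 0x82F63B78u  /* bit-reflected 0x1EDC6F41 */

/* Fold one byte into the running CRC, one bit at a time. */
static uint32_t crc_update_byte(uint32_t crc, uint8_t byte, uint32_t poly_reflected)
{
    crc ^= byte;
    for (int bit = 0; bit < 8; bit++)
        crc = (crc >> 1) ^ ((crc & 1u) ? poly_reflected : 0u);
    return crc;
}

/* CRC of a whole buffer, with the conventional initial and final inversion. */
static uint32_t crc_buffer(const void *data, size_t len, uint32_t poly_reflected)
{
    assert(data != NULL || len == 0);
    const uint8_t *bytes = data;
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++)
        crc = crc_update_byte(crc, bytes[i], poly_reflected);
    return crc ^ 0xFFFFFFFFu;
}
```

With the standard nine-character check string "123456789", crc_buffer returns 0xCBF43926 for the CRC-32 polynomial and 0xE3069283 for the CRC-32C polynomial.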

The atomic instructions can be used as an alternative to Load-exclusive/Store-exclusive instructions, for example to ease the implementation of atomic memory updates in very large systems. This could be in a closely coupled cache, sometimes referred to as near atomics, or further out in the memory system as far atomics. The instructions provide atomic update of register content with memory for a range of conditions:

  • Compare and swap of 8-, 16-, 32-, 64- or a pair of 32- or 64-bit registers as a conditional update of a value in memory.
  • ADD, BitClear, ExclusiveOR, BitSet, and signed and unsigned MAXimum or MINimum value data processing operations on 8-, 16-, 32- or 64-bit values in memory. These can occur with or without copying the original value in memory to a register.
  • Swap of an 8-, 16-, 32- or 64-bit value between a register and value in memory.
  • The instructions also include controls that influence their ordering properties, based on acquire and release semantics.
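The operations listed above map naturally onto the portable C11 <stdatomic.h> interface, as the following minimal sketch shows. It assumes a toolchain such as GCC or Clang that can target ARMv8.1-A (for example with -march=armv8.1-a), in which case these calls may be compiled to the new atomic instructions instead of Load-exclusive/Store-exclusive loops; the exact instruction selection depends on the compiler. The function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* ADD that also returns the original value held in memory. */
static uint32_t counter_fetch_add(_Atomic uint32_t *ctr, uint32_t n)
{
    return atomic_fetch_add_explicit(ctr, n, memory_order_relaxed);
}

/* BitClear expressed as an atomic AND with the inverted mask, with release ordering. */
static uint32_t flags_clear(_Atomic uint32_t *flags, uint32_t mask)
{
    return atomic_fetch_and_explicit(flags, ~mask, memory_order_release);
}

/* Compare and swap: take the lock only if it is currently 0, with acquire ordering. */
static bool try_lock(_Atomic uint32_t *lock)
{
    uint32_t expected = 0;
    return atomic_compare_exchange_strong_explicit(
        lock, &expected, 1, memory_order_acquire, memory_order_relaxed);
}

/* Swap a register value with the value in memory. */
static uint32_t swap_value(_Atomic uint32_t *slot, uint32_t v)
{
    return atomic_exchange_explicit(slot, v, memory_order_acq_rel);
}

/* Single-threaded walk through each operation. */
static void atomics_demo(void)
{
    _Atomic uint32_t ctr = 5, flags = 0xFF, lock = 0, slot = 1;

    assert(counter_fetch_add(&ctr, 3) == 5 && atomic_load(&ctr) == 8);
    assert(flags_clear(&flags, 0x0F) == 0xFF && atomic_load(&flags) == 0xF0);
    assert(try_lock(&lock));    /* lock was free */
    assert(!try_lock(&lock));   /* second attempt fails while held */
    assert(swap_value(&slot, 2) == 1 && atomic_load(&slot) == 2);
}
```

The memory_order arguments play the role of the acquire and release controls described in the last bullet, so the same source can express both the operation and its ordering requirement.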

The limited order (LO) support is in two parts:

  • System registers configure one or more memory LORegions with a minimum resolution of 64KB.
  • LoadLOAcquire and StoreLORelease instructions for 8-, 16-, 32- and 64-bit values are added, and can be used instead of the global ARMv8 LoadAcquire and StoreRelease instructions.

Exception Model and Translation System enhancements

Additions associated with the exception and memory model are:

  • A new Privileged Access Never (PAN) state bit. This bit prevents privileged access to user data unless that access is explicitly enabled, providing an additional security mechanism against possible software attacks.
  • An increased VMID range for virtualization, supporting a larger number of virtual machines.
  • Optional support for hardware update of the page table access flag, and the standardization of an optional, hardware updated, dirty bit mechanism.
  • The Virtualization Host Extensions (VHE). These enhancements improve the performance of Type 2 hypervisors by reducing the software overhead associated with transitioning between the Host and Guest operating systems. The extensions allow the Host OS to execute at EL2, as opposed to EL1, without substantial modification.
  • A mechanism to free up some translation table bits for operating system use, where the hardware support is not needed by the OS.

Finally, some new events are added to the Performance Monitor Unit (PMU) to better support profiling in operating systems, for example through the perf utility in Linux.


The ARM architecture, in line with other processor architectures, is evolving with time. ARMv8.1 is the first set of changes that ARM is introducing to the latest version of its ARMv8 A-profile architecture, grouped to help the ecosystem manage tools and software support alongside the large numbers of ARMv8-A based processors and products in development or production today. These changes provide incremental benefits over v8.0, and as such, will appear as a gradual migration in cores and related products over several years. It should be noted that other design choices by silicon partners can have a much greater impact than the choice between v8.0 and v8.1, and consequently we expect both to co-exist in the market for many years to come. Public specifications will be supplied to support initial product introductions mid-2015, with some early visibility through tools and software starting now. Partners can currently obtain more details under a confidentiality agreement through their sales and support channels.

David Brash is Architecture Program Director in the Architecture and Technology Group, one of several groups within ARM’s engineering community.