Arm A-Profile Architecture Developments 2023

Martin Weidmann
October 5, 2023

As computing demands continue to evolve with the rise of artificial intelligence (AI) and advancing security threats, it is imperative that the foundational computing architecture at the heart of the world’s devices continues to evolve. This is why our engineering teams add new features and technologies to the pervasive Arm architecture, with the software teams then ensuring that software lands on these future features and technologies as seamlessly as possible.

How the Arm architecture is developed

Arm releases annual updates to the Arm Instruction Set Architecture (ISA), which are created in collaboration with our diverse set of partners from across the Arm ecosystem. The process involves silicon partners, operating system vendors, OEMs, Arm’s internal engineering teams, and standards bodies.

A strongly curated ISA ensures that software continues to work on historic and new hardware for years to come. Arm works closely with Linaro and a host of other partners to enable the Arm ISA in the most widely used upstream software communities, such as the Linux kernel and distros, to help deliver the broadest developer ecosystem on the planet.

Each September, we release a blog which discusses some of the key additions to the A-Profile architecture in that year. Alongside the blog, we release full Instruction Set and System Register documentation via our developer web pages.

The complete Arm Architecture Reference Manual (Arm ARM) is also updated annually. An update to include the 2023 extensions is due for release in early 2024. Updates to the ‘Learn the Architecture’ pages will also appear during 2023 and 2024.

The Arm architecture journey

Publishing the blog and documentation is only one step in deploying new architecture. The next step will be working with our ecosystem partners to ensure that open-source software is enabled to make use of this functionality as soon as the hardware becomes available.

In 2023, Arm is introducing features to support our ongoing focus on artificial intelligence (AI), machine learning (ML) and security. Enabling secure AI everywhere is a key priority for the Arm architecture, with the training of Neural Networks (NNs) critical to the continued development and advancement of AI. This is why the 2023 architecture extensions include a new 8-bit floating-point format called FP8 that is already seeing rapid adoption across NNs. For security, we are adding Checked Pointer Arithmetic, which builds on existing support for Arm Memory Tagging Extension (MTE) that allows developers to detect memory safety violations quickly, saving them costs and time during the application development process.

Details of previous updates to the A-Profile architecture are available here: 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021 and 2022.

Let’s look at some of the new features we’ve added this year.

Floating Point 8 (FP8)

In 2022, Arm, Intel, and Nvidia announced their collaboration on FP8, an interchange format that allows software ecosystems to share NN models easily and supports the continuous advancement of AI computing capabilities. As part of the 2023 extensions, FP8 support is added to SME2, SVE2 and Advanced SIMD (Neon).

FP8 supports two data formats: E5M2 and E4M3. These two formats give different trade-offs between precision and range.

FP8 formats

The format in use is selected by fields in the FPMR register. Different formats can be selected for the different inputs to an instruction, which allows efficient working with datasets in mixed formats. We firmly believe in the benefits of the industry coalescing around one 8-bit floating point format, enabling developers to focus on innovation and differentiation where it really matters. We are excited to see how FP8 advances AI development in the future.
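To make the trade-off between precision and range concrete, the sketch below decodes an 8-bit value under each format. It assumes the encodings described in the OCP 8-bit Floating Point specification: E5M2 with an exponent bias of 15 and IEEE-style infinities and NaNs, and E4M3 with a bias of 7, no infinities, and a single NaN bit pattern per sign. The function names are illustrative only, and the code does not model FPMR or the behavior of any instruction.

```python
import math


def decode_fp8(byte, exp_bits, man_bits, bias, ieee_specials):
    """Decode one FP8 byte into a Python float (illustrative helper).

    ieee_specials=True  : all-ones exponent encodes infinity/NaN (assumed for E5M2).
    ieee_specials=False : only all-ones exponent AND mantissa is NaN (assumed for E4M3).
    """
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    exp_max = (1 << exp_bits) - 1
    man_max = (1 << man_bits) - 1

    if ieee_specials and exp == exp_max:
        return sign * math.inf if man == 0 else math.nan
    if not ieee_specials and exp == exp_max and man == man_max:
        return math.nan
    if exp == 0:
        # Subnormal: no implicit leading one.
        return sign * man * 2.0 ** (1 - bias - man_bits)
    # Normal: implicit leading one plus fractional mantissa.
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)


def decode_e5m2(byte):
    return decode_fp8(byte, exp_bits=5, man_bits=2, bias=15, ieee_specials=True)


def decode_e4m3(byte):
    return decode_fp8(byte, exp_bits=4, man_bits=3, bias=7, ieee_specials=False)


if __name__ == "__main__":
    print("E5M2 largest finite:", decode_e5m2(0x7B))
    print("E4M3 largest finite:", decode_e4m3(0x7E))
    print("E5M2 next value after 1.0:", decode_e5m2(0x3D))
    print("E4M3 next value after 1.0:", decode_e4m3(0x39))
```

Under these assumptions, E5M2 reaches 57344 but steps from 1.0 straight to 1.25, while E4M3 tops out at 448 and steps from 1.0 to 1.125: more range in one case, finer precision in the other.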

Live migration

Live migration is the process of moving a virtual machine (VM) from one host to another, while preserving availability and state. Support for efficient live migration is an important tool for large-scale data center management.

Live migration of a VM

To implement live migration, a hypervisor copies pages to the new host while the VM is still running on the old host. This typically requires an iterative process, as the VM might ‘dirty’ (write to) a page that has already been copied, which then needs to be copied again. There are different approaches to solving this problem, but they must all contend with three challenges:

  • Recording: Creating a record of the pages the VM has written to (dirtied).
  • Surveying: Processing the records to determine which pages need to be re-copied.
  • Cleaning: Resetting the recording mechanism on each iteration.

The 2023 extensions introduce features to help optimize all three of these.

FEAT_HDBSS adds the ability to record a log of the stage 2 pages or blocks dirtied. This mechanism addresses the Recording cost, as the memory management unit (MMU) can efficiently create the log without interrupting execution of the VM. The log also addresses the Surveying cost, as the generated data is in a format that hypervisors can efficiently consume.

FEAT_HDBSS

To address the Cleaning cost, FEAT_HACDBS adds an accelerator for cleaning the dirty state in the stage 2 translation tables. The engine uses the log of dirtied pages to locate the stage 2 translation table descriptors that need to be updated.

Together these features can give significant performance and efficiency improvements to live migration.
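The iterative copy loop described above, with its Recording, Surveying and Cleaning steps, can be sketched as a toy simulation. This is a minimal model under stated assumptions, not how any hypervisor or the FEAT_HDBSS and FEAT_HACDBS hardware actually works: pages are dictionary entries, the dirty log is a plain Python list, and the names `live_migrate`, `writes_per_round` and `stop_threshold` are hypothetical.

```python
def live_migrate(source, writes_per_round, max_rounds=10, stop_threshold=1):
    """Toy pre-copy migration.

    source           : dict mapping page number -> contents on the old host.
    writes_per_round : list of lists of (page, value) writes that the VM
                       performs while each copy round is in progress.
    Returns (dest, rounds): the pages on the new host and the number of
    copy rounds that ran while the VM was still executing.
    """
    dest = {}
    to_copy = set(source)  # The first round copies every page.
    rounds = 0

    while True:
        # Cleaning: reset the recording mechanism for this iteration.
        dirty_log = []

        # Copy the selected pages while the VM keeps running.
        for page in to_copy:
            dest[page] = source[page]

        # Recording: each write made by the VM during the round is logged.
        writes = writes_per_round[rounds] if rounds < len(writes_per_round) else []
        for page, value in writes:
            source[page] = value
            dirty_log.append(page)
        rounds += 1

        # Surveying: reduce the log to the set of pages to re-copy.
        to_copy = set(dirty_log)
        if len(to_copy) <= stop_threshold or rounds >= max_rounds:
            break

    # Final step: the VM is paused and the remaining dirty pages are copied.
    for page in to_copy:
        dest[page] = source[page]
    return dest, rounds


if __name__ == "__main__":
    pages = {0: "a", 1: "b", 2: "c", 3: "d"}
    writes = [[(0, "A"), (1, "B"), (0, "AA")], [(1, "BB"), (2, "C")], []]
    migrated, rounds = live_migrate(pages, writes)
    print(rounds, migrated == pages)
```

In this model the cost of each round depends on how cheaply the log can be produced, scanned and reset, which are the three costs the new features target.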

Checked Pointer Arithmetic

AArch64 supports features which re-purpose the upper bits of registers holding addresses. Examples include Tagged Pointers, introduced in Armv8.0-A, and MTE, introduced in Armv8.5-A.

Software frequently needs to manipulate pointers, for example adding an offset to a base address. This is typically done using regular arithmetic operations, such as add or subtract. An overflow in the address calculation could corrupt the non-address bits. For example, if MTE is being used, the address manipulation could cause the Tag stored in the pointer to be changed. A corrupted tag might lead to the processor not detecting a memory safety violation as illustrated below:

Corrupted tag leading to processor not detecting a memory safety violation

The 2023 extensions introduce new instructions specifically intended for operating on pointers. These instructions incorporate multiple pointer-specific checks, including detecting whether bits[63:56] are modified and protecting against overflow. Load and store instructions with <base+offset> addressing modes can also be configured to preserve bits[63:56].

Taking the previous MTE example, the new features allow the processor to detect if the top 8 bits of the pointer have been modified. This means that if the MTE tag were corrupted it would be reported back to software.
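The failure mode and the check can be illustrated with 64-bit integer arithmetic. The sketch below is a software model only: `plain_add` mimics a regular add, and `checked_add` is a hypothetical helper that raises an exception when bits[63:56] change. It does not reproduce the architectural behavior of the new instructions, which is defined in the Arm documentation.

```python
MASK64 = (1 << 64) - 1
TOP_BYTE_SHIFT = 56  # bits[63:56] hold the non-address bits in this model


class TopByteModified(Exception):
    """Raised by the illustrative checked_add when bits[63:56] change."""


def top_byte(ptr):
    return (ptr >> TOP_BYTE_SHIFT) & 0xFF


def plain_add(ptr, offset):
    # Regular 64-bit add: a carry out of bit 55 silently alters the top byte.
    return (ptr + offset) & MASK64


def checked_add(ptr, offset):
    # Same arithmetic, but report any change to bits[63:56].
    result = plain_add(ptr, offset)
    if top_byte(result) != top_byte(ptr):
        raise TopByteModified(
            f"top byte changed from {top_byte(ptr):#04x} to {top_byte(result):#04x}"
        )
    return result


if __name__ == "__main__":
    # Pointer with 0x03 in the top byte and an address close to the top of bits[55:0].
    tagged = 0x03FFFFFFFFFFFFF0
    corrupted = plain_add(tagged, 0x20)
    print(f"plain add:   {corrupted:#018x} (top byte {top_byte(corrupted):#04x})")
    try:
        checked_add(tagged, 0x20)
    except TopByteModified as err:
        print("checked add:", err)
```

With the regular add, the carry turns a top byte of 0x03 into 0x04 without any indication, so a later MTE check would compare against the wrong tag. The checked variant surfaces the change at the point of the arithmetic.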

Other functionality

Other enhancements introduced as part of the 2023 extensions include:

  • Support for using a combination of the PC (Program Counter) and the SP (the currently selected Stack Pointer) as the modifier when generating or checking Pointer Authentication codes.
  • For Realm Management Extension (RME) enabled designs, support for non-secure only in the Granule Protection Tables, and the ability to disable certain Physical Address Spaces (PAS).
  • EL3 configuration write-traps.
  • Breakpoint support for address range and mismatch triggering without the need for linking.
  • Support for efficiently delegating SErrors from EL3 to EL2 or EL1.

Summary

This blog provides a brief introduction to the latest features included in the Arm architecture as Armv9.5-A. More detailed information can be found on our Developer website.

Over the coming months, Arm will be working with our partners to ensure that the software ecosystem is enabled to utilize these features as soon as future processors become available.
