Arm processors are central to today’s AI-enabled software, interpreting and executing instructions. The Arm Instruction Set Architecture (ISA) is the bridge between hardware and software. It continuously evolves to meet modern demands in AI, machine learning, chiplet adoption, and security. Ongoing innovation ensures Arm remains performant, efficient, secure, and flexible for developers.
Each year, Arm publishes updates to the A-Profile architecture alongside full Instruction Set and System Register documentation. In 2025, the update is Armv9.7-A.
Arm works with our ecosystem to accelerate ISA enablement in widely used software upstream communities, such as the Linux kernel and distros. This work supports the world’s broadest developer base.
Previous updates to the A-Profile architecture are available here: 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, and 2024.
Let us look at some of the new features added this year.
The Arm architecture provides a common, compatible foundation for software across a wide range of implementations, each tailored to specific markets. From ultra-efficient sensors to high-performance supercomputers, Arm processors power a vast spectrum of devices. This broad reach highlights the scalability of the architecture, a defining theme in Arm’s 2025 updates.
When an operating system (OS) or hypervisor changes address mappings in the Memory Management Unit (MMU), any stale cached copies of old translations need to be invalidated. On Arm, Translation-Lookaside Buffer (TLB) Invalidation (TLBI) operations are broadcast, so that all cores clear the affected entries from their TLBs. Efficiently handling these broadcasts is crucial as systems grow to use multiple chiplets or chips, where latency can be an issue.
Armv9.7-A lets software invalidate TLB entries more precisely and efficiently by grouping cores into Domains. This segments the system so that TLBI broadcasts are only sent to cores where the affected workload ran.
The diagram above shows an example system using Domains. Domains can be defined per chiplet, across multiple chiplets, or across the entire system. This lets software specify where invalidations occur, which reduces latency and improves performance in multi-chip systems.
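The effect of scoping a broadcast to a Domain can be sketched with a toy model. This is a conceptual illustration only: the `Core` and `System` classes, the domain names, and the `invalidate` method are hypothetical and do not represent the Armv9.7-A TLBI instruction encodings or any real OS interface.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    core_id: int
    # Toy TLB: virtual page number -> physical page number
    tlb: dict = field(default_factory=dict)


class System:
    """Toy model of cores grouped into named invalidation domains."""

    def __init__(self, domains):
        # domains: mapping of domain name -> list of core ids (hypothetical layout)
        self.domains = domains
        core_ids = sorted({cid for ids in domains.values() for cid in ids})
        self.cores = {cid: Core(cid) for cid in core_ids}

    def invalidate(self, vpage, domain=None):
        """Drop vpage from the TLB of each targeted core.

        With domain=None the invalidation reaches every core (a full broadcast).
        With a domain name it reaches only the cores in that domain.
        Returns the number of cores that received the invalidation.
        """
        targets = list(self.cores) if domain is None else self.domains[domain]
        for cid in targets:
            self.cores[cid].tlb.pop(vpage, None)
        return len(targets)


# Two chiplets with four cores each; the workload only ran on chiplet0.
system = System({"chiplet0": [0, 1, 2, 3], "chiplet1": [4, 5, 6, 7]})
system.cores[0].tlb[0x40] = 0x9000
system.cores[1].tlb[0x40] = 0x9000
reached = system.invalidate(0x40, domain="chiplet0")
```

In this model the domain-scoped call touches four cores instead of eight. It is only safe because the workload never ran outside `chiplet0`; choosing a domain that is too narrow would leave stale entries behind.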
Memory System Resource Partitioning and Monitoring (MPAM) support was introduced in Armv8.4-A. The 2025 extensions introduce MPAMv2, delivering:
MPAM uses two IDs to partition and monitor resources: the Partition ID (PARTID) and the Performance Monitoring Group (PMG).
Initially, PMGs could only act as filters tied to a PARTID. This meant performance profiling was limited to activity within that partition. MPAMv2 makes PARTIDs and PMGs independent. It also increases the maximum size of PMGs to 16 bits. This enables system-wide profiling without being dependent on the partitioning scheme. It gives system operators more flexibility and insight.
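The difference between a PMG tied to a PARTID and an independent PMG can be illustrated with a small model of tagged memory requests. This is a sketch of the idea as described above, not of the MPAM monitor register interface; the `Request` type, the `monitored_bytes` function, and the sample values are all hypothetical.

```python
from typing import NamedTuple


class Request(NamedTuple):
    partid: int  # partition the request is accounted to
    pmg: int     # performance monitoring group label
    nbytes: int  # bytes transferred by this request


def monitored_bytes(requests, pmg, partid=None):
    """Sum the bytes of requests that match the monitor filter.

    partid given : PMG acts as a filter within one partition.
    partid=None  : PMG is matched on its own, across all partitions.
    """
    return sum(
        r.nbytes
        for r in requests
        if r.pmg == pmg and (partid is None or r.partid == partid)
    )


requests = [
    Request(partid=1, pmg=7, nbytes=64),
    Request(partid=2, pmg=7, nbytes=128),
    Request(partid=1, pmg=3, nbytes=32),
]

within_partition = monitored_bytes(requests, pmg=7, partid=1)
system_wide = monitored_bytes(requests, pmg=7)
```

With the tied filter, only traffic from partition 1 is visible. With the independent filter, the same PMG label collects traffic from every partition, which is what makes profiling independent of the partitioning scheme.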
In the original MPAM, only 32 PARTIDs could be virtualized. PMGs could not. MPAMv2 fixes both limits by using in-memory ID translation. It removes the cap and enables independent virtualization of PARTIDs and PMGs.
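As a rough sketch of why moving the translation into memory lifts the limit, the model below uses one table class for both cases. The class name, table sizes, and ID values are assumptions made for illustration; the actual MPAMv2 table format is not described here and is not modeled.

```python
class IdTranslationTable:
    """Toy virtual-to-physical ID map with a fixed number of entries."""

    def __init__(self, entries):
        self._table = [None] * entries

    def map(self, virtual_id, physical_id):
        if not 0 <= virtual_id < len(self._table):
            raise IndexError(f"virtual ID {virtual_id} does not fit in the table")
        self._table[virtual_id] = physical_id

    def translate(self, virtual_id):
        if not 0 <= virtual_id < len(self._table):
            raise IndexError(f"virtual ID {virtual_id} does not fit in the table")
        physical_id = self._table[virtual_id]
        if physical_id is None:
            raise LookupError(f"virtual ID {virtual_id} is not mapped")
        return physical_id


# Original MPAM as described above: 32 virtual PARTIDs, no PMG virtualization.
legacy_partids = IdTranslationTable(entries=32)

# In-memory translation: table sizes are chosen by software (sizes here are
# arbitrary), and PARTIDs and PMGs each get their own independent table.
partid_table = IdTranslationTable(entries=4096)
pmg_table = IdTranslationTable(entries=1024)

partid_table.map(500, 42)
pmg_table.map(9, 300)
```

Keeping separate tables for PARTIDs and PMGs mirrors the independent virtualization described above: a guest can be given its own PMG space without changing how its PARTIDs are mapped.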
These updates help ensure predictable performance across diverse workloads through better control and monitoring. For more information on MPAM, see the Learn Architecture guides.
As AI workloads advance, Arm evolves the architecture to support developers. The 2025 Armv9.7-A update adds new Scalable Vector Extension (SVE) and Scalable Matrix Extension (SME) instructions to efficiently work with 6-bit data types.
This includes the OCP MXFP6 format, a compact 6-bit floating-point standard from the Open Compute Project. It improves efficiency in AI models by reducing memory use and bandwidth needs.
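To make the format concrete, here is a minimal software decoder for MXFP6 values. It assumes the element layouts from the OCP Microscaling specification (E2M3 with exponent bias 1, E3M2 with exponent bias 3, no infinities or NaNs in the elements) and a shared 8-bit power-of-two scale per block of 32 elements. The function names are hypothetical, and this models the data format only, not the new SVE or SME instructions.

```python
def decode_fp6(bits, exp_bits, man_bits, bias):
    """Decode one 6-bit element laid out as sign | exponent | mantissa."""
    sign = -1.0 if (bits >> (exp_bits + man_bits)) & 1 else 1.0
    exp = (bits >> man_bits) & ((1 << exp_bits) - 1)
    man = bits & ((1 << man_bits) - 1)
    frac = man / (1 << man_bits)
    if exp == 0:
        # Subnormal: no implicit leading one
        return sign * frac * 2.0 ** (1 - bias)
    return sign * (1.0 + frac) * 2.0 ** (exp - bias)


def decode_e2m3(bits):
    return decode_fp6(bits, exp_bits=2, man_bits=3, bias=1)


def decode_e3m2(bits):
    return decode_fp6(bits, exp_bits=3, man_bits=2, bias=3)


def decode_mx_block(scale_e8m0, elements, decode):
    """Apply the shared block scale 2**(scale_e8m0 - 127) to each element."""
    if scale_e8m0 == 0xFF:
        # The all-ones scale encodes NaN for the whole block
        return [float("nan")] * len(elements)
    scale = 2.0 ** (scale_e8m0 - 127)
    return [scale * decode(e) for e in elements]


# Storage for one block of 32 elements: 32 * 6 bits of data plus an 8-bit scale.
BLOCK_BYTES_MXFP6 = (32 * 6 + 8) // 8   # 25 bytes
BLOCK_BYTES_FP16 = 32 * 2               # 64 bytes for the same count in FP16
```

Under these assumptions a block of 32 values takes 25 bytes instead of the 64 bytes needed in FP16, which is where the memory and bandwidth savings come from.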
Global video traffic volume has grown rapidly in recent years, especially on mobile devices. Much of the processing of that video will be done on the CPU rather than on dedicated accelerators. The 2025 extensions add new instructions to improve performance and efficiency of video codecs, including:
These updates improve performance and efficiency for video processing.
Other enhancements introduced as part of the 2025 extensions include:
Also released this year is GICv5, the next generation of the most widely used interrupt controller for Arm A-profile and R-profile systems. GICv5 is a re-architected design that meets the demands of modern computing. It supports scaling from single die to multi-chiplet and multi-chip systems. It also improves virtualization efficiency. Find out more from Christoffer’s blog post or read the spec.
Armv9.7-A brings new capabilities across scalability, resource management, AI, video, and security, helping developers and system designers build faster, more efficient, and more secure systems. From targeted TLB invalidations to flexible resource monitoring, from 6-bit AI datatypes to optimized video codecs, these changes are designed to meet the demands of modern computing.
We continue to work with our ecosystem to ensure these features are quickly enabled in software and available in future processors. As the Arm architecture evolves each year, developers can rely on a strong, forward-looking foundation for innovation across markets.
For deeper technical details on Armv9.7-A, visit our Arm Developer website.