Working with its architecture licensees and ecosystem partners, Arm continues to evolve its architecture, developing new functionality to meet the needs of both new and existing markets.
In this blog, read about some of the key additions to the A-Profile architecture in 2021.
Full instruction set and system register information will be available from the end of September on our developer webpages. The complete Arm Architecture Reference Manual, documenting the 2021 extensions and earlier functionality, is due for release in early 2022. Updates to the Learn the Architecture pages will appear during 2021.
Details of previous updates to the A-Profile architecture are available here: 2014, 2015, 2016, 2017, 2018, 2019 and 2020.
The memcpy()/memset() family of library functions is widely used in software. Having efficient implementations of these functions is an important part of a system’s performance.
The traditional RISC approach is to build operations such as memcpy() out of standard instructions, such as loads and stores. One issue with this approach is that the optimal instruction sequence can vary depending on factors such as the micro-architecture, the starting alignment, and the size of the operation. As a result, it is common to find preamble code in libraries that selects between a wide range of implementations, which adds overhead and increases the long-term maintenance cost for software.
To address these concerns the 2021 extensions introduce new instructions specifically targeting the memcpy() and memset() family of functions.
CPY[F]Px [dst]!,[src]!,num_bytes!
CPY[F]Mx [dst]!,[src]!,num_bytes!
CPY[F]Ex [dst]!,[src]!,num_bytes!

SETPx [dst]!,num_bytes!,data
SETMx [dst]!,num_bytes!,data
SETEx [dst]!,num_bytes!,data
For software developers, these instructions give the ability to write a standard optimized sequence that is portable across micro-architectures, alignments, and sizes. For hardware designers, the new instructions make it easier to detect memcpy()/memset() operations and therefore optimize for them.
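To make the three-instruction shape more concrete, here is a minimal software model in C. The P, M and E suffixes denote prologue, main and epilogue, and each instruction updates the destination, source and byte count in place. Everything else in the sketch is an assumption made for illustration: the `copy_state` type, the `model_*` function names, and the choice to align the destination to 16 bytes in the prologue are hypothetical, and a real implementation is free to divide the work between the phases differently. The model covers only the forward-only (F) form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model state: mirrors the three operands that the copy
 * instructions write back (destination, source, remaining byte count). */
typedef struct {
    unsigned char *dst;
    const unsigned char *src;
    size_t remaining;
} copy_state;

/* Prologue (assumed policy): copy single bytes until the destination
 * is 16-byte aligned or nothing is left. */
void model_copy_prologue(copy_state *s)
{
    while (s->remaining > 0 && ((uintptr_t)s->dst % 16) != 0) {
        *s->dst++ = *s->src++;
        s->remaining--;
    }
}

/* Main (assumed policy): move the bulk of the data in 16-byte blocks. */
void model_copy_main(copy_state *s)
{
    while (s->remaining >= 16) {
        memcpy(s->dst, s->src, 16);
        s->dst += 16;
        s->src += 16;
        s->remaining -= 16;
    }
}

/* Epilogue: copy whatever tail is left, byte by byte. */
void model_copy_epilogue(copy_state *s)
{
    while (s->remaining > 0) {
        *s->dst++ = *s->src++;
        s->remaining--;
    }
}

/* The "standard sequence": always the same three steps, regardless of
 * size or alignment. Buffers must not overlap in this forward-only model. */
void *model_memcpy(void *dst, const void *src, size_t n)
{
    copy_state s = { dst, src, n };
    model_copy_prologue(&s);
    model_copy_main(&s);
    model_copy_epilogue(&s);
    assert(s.remaining == 0); /* all bytes accounted for */
    return dst;
}
```

The point of the model is that the caller's sequence never changes: the size and alignment decisions that library preamble code makes today move behind the three steps.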
In the past some Arm processors, such as the Cortex-R4, have supported non-maskable interrupts (NMI), but they were not a standard architectural feature. This is changing with the 2021 extensions, with new support added in both the CPU and Generic Interrupt Controller (GIC) architectures.
GICv3.3 adds an NMI attribute that software can assign to interrupts. Interrupts with the NMI attribute are treated as the highest priority for the owning Security state, with different masking and pre-emption rules:
Figure 1: Handling of non-maskable interrupts in the GIC and CPU
Within the CPU, NMIs are not subject to the existing PSTATE.I and PSTATE.F masks, which allows NMIs to be taken as exceptions even when most interrupts are masked. Some masking of NMIs is still necessary, for example on interrupt entry and exit, to prevent corruption of return state. A new mask, PSTATE.AllInt, is added that masks all interrupts, including NMIs. Software can also use the selected stack pointer, PSTATE.SP, as an implicit mask.
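The CPU-side masking rules described above can be summarized as a small decision function. This is only a sketch of the rules as stated in this blog: the `model_pstate` type, the `model_kind` enum and the function name are hypothetical, and the model deliberately leaves out Exception levels, Security states, interrupt routing, and the optional use of PSTATE.SP as an implicit mask.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the three mask bits discussed in the text. */
typedef struct {
    bool i;      /* PSTATE.I: masks ordinary IRQs */
    bool f;      /* PSTATE.F: masks ordinary FIQs */
    bool allint; /* PSTATE.AllInt: masks all interrupts, including NMIs */
} model_pstate;

typedef enum { MODEL_IRQ, MODEL_FIQ } model_kind;

/* Returns true when the interrupt is held off by the CPU masks. */
bool model_interrupt_masked(model_pstate p, model_kind kind, bool has_nmi_attr)
{
    assert(kind == MODEL_IRQ || kind == MODEL_FIQ);

    if (p.allint)
        return true;   /* AllInt masks everything, NMIs included */
    if (has_nmi_attr)
        return false;  /* NMIs ignore PSTATE.I and PSTATE.F */
    return kind == MODEL_IRQ ? p.i : p.f;
}
```

With both PSTATE.I and PSTATE.F set, an ordinary IRQ is masked in this model while an IRQ carrying the NMI attribute is still taken. Only setting PSTATE.AllInt holds off both.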
The Performance Monitoring Unit (PMU) is an important tool for helping developers to understand how efficiently their code runs on Arm processors.
The 2021 extensions add new PMU events for cache line state tracking. These events can be used to profile the accuracy of cache prefetching. Another set of PMU events is added for reporting where data is coming from on cache hits, giving information on the type and level of the cache.
Some PMU events can increment by more than 1 per cycle, for example the number of FP operations per cycle. The 2021 extensions introduce a new threshold control, which allows software to examine the distribution of these values, by creating a histogram profile.
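One way to picture how a threshold control yields a histogram is the following C model. It assumes the threshold behaves as a "count this cycle only if the per-cycle value is at least T" filter, and that software reruns the same workload once per threshold value. The array of per-cycle values stands in for the hardware event, and the `model_*` names are hypothetical. The actual control fields and comparison options are defined by the architecture and are not modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models one profiling run with a fixed threshold: counts the cycles in
 * which the per-cycle event value is greater than or equal to threshold. */
uint64_t model_cycles_at_or_above(const unsigned *per_cycle, size_t n,
                                  unsigned threshold)
{
    uint64_t cycles = 0;
    for (size_t c = 0; c < n; c++) {
        if (per_cycle[c] >= threshold)
            cycles++;
    }
    return cycles;
}

/* Builds hist[v] = number of cycles with exactly v events, for v in
 * 0..max_value, by differencing the counts from adjacent thresholds.
 * Cycles with more than max_value events are not recorded in any bucket.
 * hist must have room for max_value + 1 entries. */
void model_histogram(const unsigned *per_cycle, size_t n,
                     unsigned max_value, uint64_t *hist)
{
    assert(hist != NULL);
    for (unsigned v = 0; v <= max_value; v++) {
        uint64_t at_least_v = model_cycles_at_or_above(per_cycle, n, v);
        uint64_t at_least_next = model_cycles_at_or_above(per_cycle, n, v + 1);
        hist[v] = at_least_v - at_least_next;
    }
}
```

For example, with per-cycle FP operation counts of 0, 2, 4, 2, 1, 0, 4, 4 and `max_value` set to 4, the model reports two cycles with 0 operations, one with 1, two with 2, none with 3, and three with 4.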
Other features included in the 2021 extensions:
This blog provides a brief introduction to the latest features included in the Arm architecture as Armv8.8-A and Armv9.3-A. More detailed information can be found on our Developer website.
The next step will be working with our ecosystem partners, including Linaro, to ensure that open-source software is enabled to make use of this functionality when the hardware becomes available.
Join me at Virtual Linaro Connect in September to learn more about the 2021 extensions and take part in the discussions.
Ouch, a maskable "NMI" :-) It should not be maskable in the core, but rather the GIC.