
Arm A-Profile Architecture Developments 2021

Martin Weidmann
September 8, 2021

Working with its architecture licensees and ecosystem partners, Arm continues to evolve its architecture, developing new functionality to meet the needs of both new and existing markets.

In this blog, read about some of the key additions to the A-Profile architecture in 2021.

Full instruction set and system register information is available from the end of September on our developer webpages. The complete Arm Architecture Reference Manual, documenting the 2021 extensions and earlier functionality, is due for release in early 2022. Updates to the Learn the Architecture pages will appear during 2021.

Details of previous updates to the A-Profile architecture are available here: 2014, 2015, 2016, 2017, 2018, 2019 and 2020.

Optimizing for the memcpy() family of functions

The memcpy()/memset() family of library functions is widely used in software. Having efficient implementations of these functions is an important part of a system’s performance.

The traditional RISC approach is to build operations such as memcpy() out of standard instructions, such as loads and stores. One issue with this approach is that the optimal instruction sequence can vary depending on factors such as the micro-architecture, the starting alignment, and the size of the operation. As a result, it is common to find pre-amble code in libraries that selects between a wide range of implementations, which adds overhead and increases the long-term maintenance cost for software.

To address these concerns, the 2021 extensions introduce new instructions specifically targeting the memcpy() and memset() family of functions.

memcpy()/memmove():

CPY[F]Px [dst]!,[src]!,num_bytes!
CPY[F]Mx [dst]!,[src]!,num_bytes!
CPY[F]Ex [dst]!,[src]!,num_bytes!

memset():

SETPx  [dst]!,num_bytes!,data
SETMx  [dst]!,num_bytes!,data
SETEx  [dst]!,num_bytes!,data

For software developers, these instructions give the ability to write a standard optimized sequence that is portable across micro-architectures, alignments, and sizes. For hardware designers, the new instructions make it easier to detect memcpy()/memset() operations and therefore optimize for them.
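As a rough illustration of what such a standard sequence could look like, here is a minimal C sketch that wraps the forward-only copy instructions in inline assembly. It assumes a GCC or Clang style toolchain, and it relies on the ACLE feature macro __ARM_FEATURE_MOPS to detect that the build targets a core with these instructions; otherwise it falls back to the library memcpy(). The helper name copy_bytes is hypothetical, and the comments reading P, M, and E as prologue, main, and epilogue are a summary rather than a full description of each instruction's behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n bytes from src to dst. The regions must not
 * overlap, matching memcpy() semantics (the forward-only CPYF* forms). */
static void *copy_bytes(void *dst, const void *src, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));
#if defined(__aarch64__) && defined(__ARM_FEATURE_MOPS)
    /* The instructions write back to all three operands, so work on copies
     * and keep the original dst for the return value. */
    void *d = dst;
    const void *s = src;
    size_t left = n;
    __asm__ volatile(
        "cpyfp [%0]!, [%1]!, %2!\n\t"   /* prologue */
        "cpyfm [%0]!, [%1]!, %2!\n\t"   /* main */
        "cpyfe [%0]!, [%1]!, %2!"       /* epilogue */
        : "+r"(d), "+r"(s), "+r"(left)
        :
        : "memory", "cc");
    return dst;
#else
    /* Assumption: without the feature macro, defer to the C library. */
    return memcpy(dst, src, n);
#endif
}
```

The same three-instruction sequence is used whatever the size or alignment of the buffers, which is the portability property described above. A memset() equivalent would follow the same pattern with SETP, SETM, and SETE.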

Non-maskable interrupts

In the past some Arm processors, such as the Cortex-R4, have supported non-maskable interrupts (NMI), but they were not a standard architectural feature. This is changing with the 2021 extensions, with new support added in both the CPU and Generic Interrupt Controller (GIC) architectures.

GICv3.3 adds an NMI attribute that software can assign to interrupts. Interrupts with the NMI attribute are treated as the highest priority for the owning Security state, with different masking and pre-emption rules:

Figure 1: Handling of non-maskable interrupts in the GIC and CPU

Within the CPU, NMIs are not subject to the existing PSTATE.I and PSTATE.F masks, which allows NMIs to be taken as exceptions even when most interrupts are masked. Some masking of NMIs is still necessary, for example on interrupt entry and exit, to prevent corruption of return state. A new mask, PSTATE.AllInt, is added that masks all interrupts, including NMIs. Software can also use the selected stack pointer, PSTATE.SP, as an implicit mask.
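The masking rules in this paragraph can be summarized as a small decision function. The C sketch below is a simplified software model, not the architectural pseudocode: it ignores Exception levels, Security states, routing, and GIC priorities. The struct, the enum, the function name, and the sp_masks_nmi flag (standing in for software having opted in to the PSTATE.SP implicit mask) are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical model of the CPU-side masks described above. */
struct cpu_masks {
    bool i;            /* PSTATE.I: masks ordinary IRQs */
    bool f;            /* PSTATE.F: masks ordinary FIQs */
    bool allint;       /* PSTATE.AllInt: masks all interrupts, NMIs included */
    bool sp;           /* PSTATE.SP: selected stack pointer */
    bool sp_masks_nmi; /* assumption: software uses PSTATE.SP as an NMI mask */
};

enum irq_kind { KIND_IRQ, KIND_FIQ };

/* Returns true if a pending interrupt of the given kind could be taken. */
static bool interrupt_can_be_taken(const struct cpu_masks *m,
                                   enum irq_kind kind, bool is_nmi)
{
    assert(m != NULL);
    if (m->allint)
        return false;                        /* AllInt masks everything */
    if (is_nmi)
        return !(m->sp_masks_nmi && m->sp);  /* NMIs ignore I and F */
    return kind == KIND_IRQ ? !m->i : !m->f; /* ordinary interrupts */
}
```

In this model, an NMI is still taken with PSTATE.I and PSTATE.F both set, and is held off only by PSTATE.AllInt or, when enabled, by the selected stack pointer.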

Performance Monitoring Unit (PMU) updates

The Performance Monitoring Unit (PMU) is an important tool for helping developers to understand how efficiently their code runs on Arm processors. 

The 2021 extensions add new PMU events for cache line state tracking. These events can be used to profile the accuracy of cache prefetching. Another set of PMU events is added for reporting where data is coming from on cache hits, giving information on the type and level of the cache.

Some PMU events can increment by more than 1 per cycle, for example the number of FP operations per cycle. The 2021 extensions introduce a new threshold control, which allows software to examine the distribution of these values, by creating a histogram profile.
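To show how a threshold control can yield a histogram, the following C sketch models the idea in software. It assumes a thresholded counter that counts the cycles in which the event increment is at least the threshold. The real control may offer other comparison modes, and real profiling would re-run or multiplex the workload rather than read a per-cycle array. The function names and the per_cycle array are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a thresholded counter: counts cycles whose event increment
 * is greater than or equal to the threshold. */
static uint64_t count_cycles_at_or_above(const uint8_t *per_cycle,
                                         size_t cycles, uint8_t threshold)
{
    uint64_t count = 0;
    for (size_t c = 0; c < cycles; c++) {
        if (per_cycle[c] >= threshold)
            count++;
    }
    return count;
}

/* Fills hist[v] with the number of cycles whose increment is exactly v,
 * for v in [0, max_value], using only thresholded counts. */
static void build_histogram(const uint8_t *per_cycle, size_t cycles,
                            uint8_t max_value, uint64_t *hist)
{
    assert(per_cycle != NULL && hist != NULL);
    for (unsigned v = 0; v <= max_value; v++) {
        uint64_t at_or_above =
            count_cycles_at_or_above(per_cycle, cycles, (uint8_t)v);
        /* No 8-bit value exceeds 255, so the "above" count is zero there. */
        uint64_t above = (v == 255)
            ? 0
            : count_cycles_at_or_above(per_cycle, cycles, (uint8_t)(v + 1));
        hist[v] = at_or_above - above;
    }
}
```

Sweeping the threshold and subtracting adjacent counts gives the number of cycles at each exact value, which is one way software could recover the distribution from a single thresholded counter.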

Other functionality

Other features included in the 2021 extensions:

  • Hinted conditional branches.
  • QARMA3 algorithm for Pointer Authentication.
  • EL1 and EL2 traps on use of IMPDEF functionality at EL0.
  • Controls for EL0 cache maintenance operations.
  • BRBE extended to support EL3.

Summary

This blog provides a brief introduction to the latest features included in the Arm architecture as Armv8.8-A and Armv9.3-A. More detailed information can be found on our Developer website.

The next step will be working with our ecosystem partners, including Linaro, to enable open-source software to make use of this functionality when the hardware becomes available.

Join me at Virtual Linaro Connect in September to learn more about the 2021 extensions and take part in the discussions.

  • 42Bastian Schick, over 1 year ago

    Ouch, a maskable "NMI" :-) It should not be maskable in the core, but rather the GIC.
