Arm Compiler for Linux and Arm Performance Libraries 24.04

Chris Goodyer
April 16, 2024

Arm releases a new version of Arm Compiler for Linux twice a year. The package includes a full user-space compilation toolchain for Linux-based environments, for software written in C, C++ and Fortran. It also includes Arm Performance Libraries, the vendor library containing optimized implementations of sparse and dense linear algebra functions, FFTs and math.h functions.

In April 2024, we released version 24.04 of both the compilers and the libraries. It is available for free and can be downloaded here. In this blog post, we outline some of the biggest changes in this release. Note that the next release, 24.10, is expected in October 2024.

Arm Performance Libraries versions are also released as standalone downloads for Linux (compatible with GCC and NVHPC), macOS (compatible with Clang) and Windows (compatible with Clang and MSVC).

What is new in Arm Compiler for Linux 24.04?

A summary of the key features in this release: 

Compiler:

  • Updated the base compiler technology to LLVM 18 (from LLVM 17) bringing performance and stability improvements.
  • Improved performance of hyperbolic functions when called from Fortran
  • The default setting of the compiler has changed to -fno-math-errno at -O1 and above; in addition, -fmath-errno now implies -fveclib=none as the vector math functions are not compatible with -fmath-errno.
  • Improved support for vectorizing loops with math functions (fmod), and functions containing linear pointers
  • Beta-quality implementation of the SME and SME2 ACLE
  • Code quality improvements to fix a regression in sw4lite

Libraries:

  • Addition of Random Number Generators
  • Many new Arm-based systems supported with appropriate optimizations
  • Improvements in performance across the components

Package:

  • Red Hat Enterprise Linux 7 (RHEL-7) is now deprecated. Support will be removed from the Arm Compiler for Linux 24.10 release
  • The package also includes GCC 13, which has many improvements over the GCC 12 shipped in the previous release

Full release notes can be found online.

Compiler benchmark improvements

Arm Compiler for Linux (ACfL) 24.04 brings some performance improvements on a number of workloads. The graph below shows the performance of SPEC2k17 benchmarked on a Neoverse V1 system. Results are compared to the ACfL 23.04 release.

Arm Compiler for Linux Performance on SPEC 2k17: 24.04 vs 23.04

This second graph shows ACfL performance over a number of industry standard applications running over 64 cores on an AWS c7g.16xlarge. It can be seen that ACfL 24.04 delivers some noticeable improvements over ACfL 23.04.

Arm Compiler for Linux Performance on HPC workloads: 24.04 vs 23.04

Arm Performance Libraries 24.04

Arm Performance Libraries (Arm PL) provides optimized standard core math libraries for numerical applications on 64-bit Arm (AArch64) processors. These are built with OpenMP parallelism for BLAS, LAPACK, FFT, and sparse routines to maximize performance in multi-processor environments. The libraries are available for Linux, macOS and Windows.

Version 24.04 features, in addition to regular performance improvements, new functions for generating random numbers, tuned support for the latest AArch64 systems, and improved compatibility with GCC for stand-alone Arm PL releases.

Arm PL includes libamath, a library containing optimized scalar and vector math.h functions, and from 24.04 this is available on Windows for the first time.

New: Random Number Generators

Arm Performance Libraries 24.04 includes the interface to the random number generation part of the VSL library developed by Intel® and shipped for x86 processors as part of oneMKL. We are grateful to Intel® for having released this interface, along with their documentation, to us under a Creative Commons 4.0 license, allowing us to develop our own implementation of this functionality for users of Arm-based systems, enabling software portability between architectures.

By linking to Arm PL, users can now generate random values by selecting from different basic random number generators (pseudorandom, quasirandom or non-deterministic generators are supported) and then generate a stream of random values according to a chosen distribution. Both continuous distributions (such as Gaussian and uniform), and discrete distributions (such as Bernoulli and binomial) are supported. We have provided complete documentation for our implementations, including an overview of which features are, and are not, supported in this first release.
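The two-step model described above (pick a basic generator, then draw values according to a distribution) can be sketched in a few lines of Python. This is only an illustration of the idea, not the Arm PL or VSL interface: the helper names `uniform`, `bernoulli` and `gaussian` are hypothetical, and Python's built-in Mersenne Twister (`random.Random`) stands in for a pseudorandom basic generator.

```python
import math
import random


def uniform(brng, n, a=0.0, b=1.0):
    """Continuous distribution: n values spread uniformly between a and b."""
    return [a + (b - a) * brng.random() for _ in range(n)]


def bernoulli(brng, n, p):
    """Discrete distribution: n values that are 1 with probability p, else 0."""
    return [1 if brng.random() < p else 0 for _ in range(n)]


def gaussian(brng, n, mean=0.0, sigma=1.0):
    """Continuous distribution: n normal values via the Box-Muller transform."""
    out = []
    while len(out) < n:
        u1 = 1.0 - brng.random()  # shift [0, 1) to (0, 1] so log(u1) is defined
        u2 = brng.random()
        radius = math.sqrt(-2.0 * math.log(u1))
        out.append(mean + sigma * radius * math.cos(2.0 * math.pi * u2))
        if len(out) < n:
            out.append(mean + sigma * radius * math.sin(2.0 * math.pi * u2))
    return out


# Step 1: choose and seed a basic generator (hypothetical stand-in).
stream = random.Random(42)

# Step 2: draw from whichever distributions are needed, all from one stream.
coin_flips = bernoulli(stream, 8, p=0.5)
noise = gaussian(stream, 4, mean=0.0, sigma=1.0)
samples = uniform(stream, 4, a=-1.0, b=1.0)
```

Keeping the generator and the distribution separate means the same seeded stream can feed any distribution, which is the property the interface relies on for reproducible runs.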

We have endeavored to ensure that the same generators and initializations are used as documented in the oneMKL documentation. This means that functions which return bit sequences are bitwise reproducible between Arm and x86 systems. If an integer or floating-point answer is requested, answers may differ because the precision of various operations differs between the two libraries.
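As a hypothetical illustration of why identical bit sequences can still produce different floating-point answers, the Python sketch below converts the same 32-bit value to a number in the unit interval in two ways: once in double precision and once with every step rounded to single precision. The conversion formulas are assumptions chosen for the example and are not taken from either library.

```python
import struct


def to_float32(x):
    """Round a double-precision Python float to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def bits_to_unit_double(bits):
    """Scale a 32-bit integer by 2**-32 using double-precision arithmetic."""
    return bits * 2.0 ** -32


def bits_to_unit_single(bits):
    """Same scaling, but with each intermediate rounded to single precision."""
    as_single = to_float32(float(bits))
    return to_float32(as_single * to_float32(2.0 ** -32))


raw_bits = 0xFFFFFFFF  # identical generator output on both "systems"
in_double = bits_to_unit_double(raw_bits)
in_single = bits_to_unit_single(raw_bits)
values_differ = in_double != in_single
```

The raw bits match exactly, but the largest 32-bit value cannot be represented in single precision and rounds up before scaling, so the two conversions disagree. Values such as 0x80000000, which both formats represent exactly, convert identically.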

Note that not all of the random number functions from VSL are included in this release. The missing functions are listed in the documentation as not currently implemented. We intend to fill out this coverage in future releases, and we are very keen to hear from users who find missing functionality that they would like us to prioritize.

The following chart demonstrates the benefit of having the VSL RNGs available on Arm for machine learning using PyTorch. PyTorch can be configured to use the VSL RNGs from Arm Performance Libraries as part of the dropout layer, drawing random values from a Bernoulli distribution. For example, for a batch size of 16, with an input tensor [16, 128, 3072], we see a 4x performance improvement when running sequentially compared with using the default RNGs within PyTorch. In addition, when the VSL interface is enabled, PyTorch calls the skip-ahead function vslSkipAheadStream to allow the parallel generation of random values. If VSL is not used, the random values are always generated sequentially, without parallelism. Unlocking parallelism for the same input tensor [16, 128, 3072] using 16 threads improves performance further: the dropout layer runs around 44 times faster with Arm PL than with the default RNGs.
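The skip-ahead idea can be sketched with a toy generator. The Python example below is not how Arm PL or PyTorch implement it: it assumes a simple multiplicative congruential generator, and the names `skip_ahead`, `bernoulli_block` and `dropout_mask_chunked` are hypothetical. It shows that each worker can jump directly to its own offset in the stream and that the combined result matches sequential generation.

```python
MODULUS = 2**31 - 1   # toy multiplicative congruential generator parameters
MULTIPLIER = 48271


def next_state(state):
    """Advance the generator by one step."""
    return (MULTIPLIER * state) % MODULUS


def skip_ahead(state, nskip):
    """Jump nskip steps at once: state * MULTIPLIER**nskip mod MODULUS."""
    return (pow(MULTIPLIER, nskip, MODULUS) * state) % MODULUS


def bernoulli_block(state, count, keep_prob):
    """Draw count Bernoulli values (1 = keep, 0 = drop) starting from state."""
    mask = []
    for _ in range(count):
        state = next_state(state)
        mask.append(1 if state / MODULUS < keep_prob else 0)
    return mask


def dropout_mask_sequential(seed, n, keep_prob):
    """Reference: one stream produces all n values in order."""
    return bernoulli_block(seed, n, keep_prob)


def dropout_mask_chunked(seed, n, keep_prob, workers):
    """Split n values into per-worker chunks; each chunk skips to its offset.

    The chunks are independent, so they could run on separate threads.
    They are run in a plain loop here to keep the sketch short.
    """
    chunk = -(-n // workers)  # ceiling division
    mask = []
    for worker in range(workers):
        start = worker * chunk
        count = max(0, min(chunk, n - start))
        mask.extend(bernoulli_block(skip_ahead(seed, start), count, keep_prob))
    return mask


sequential = dropout_mask_sequential(12345, 1000, 0.9)
chunked = dropout_mask_chunked(12345, 1000, 0.9, workers=16)
same_mask = sequential == chunked
```

For the tensor in the text, [16, 128, 3072] holds 16 × 128 × 3072 = 6,291,456 elements, so with 16 threads each worker would generate 393,216 values after skipping to its own starting offset.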

Arm PL 24.04: RNG performance in PyTorch

Support for new systems

From 24.04 the libraries have been tuned to run efficiently on the following new Arm systems:

  • NVIDIA Grace and AWS Graviton4, based on Arm Neoverse V2
  • Microsoft Cobalt 100 and Alibaba Cloud Yitian 710, based on Arm Neoverse N2
  • AmpereOne

This is in addition to the systems previously tuned for:

  • AWS Graviton3, based on Arm Neoverse V1
  • AWS Graviton2 and Ampere Altra and Altra Max, based on Arm Neoverse N1
  • Fujitsu A64FX

GCC compatibility

Previously, the standalone version of Arm Performance Libraries for Linux was available as separate downloads for each supported Linux distribution and for each of the supported major versions of GCC. However, this approach does not scale well as we add support for newer distributions and newer versions of GCC. As we move to supporting GCC 13 in this release, we have chosen to simplify the download options for Linux users, without loss of support for any compiler version or OS. There are now just two downloadable packages: one for RPM distributions and one for .deb distributions.

Users should download the RPM package if they are using one of the following supported distributions:

  • Amazon Linux 2, Amazon Linux 2023
  • RHEL-7, RHEL-8, RHEL-9
  • SLES-15

Users should download the .deb package if they are using one of the supported Ubuntu distributions:

  • Ubuntu 20
  • Ubuntu 22

The version of Arm PL is exactly the same in each GCC-compatible package, and it is supported with GCC versions 7 through 13.

Note: we also support the NVIDIA HPC compiler (NVHPC) in a similar way, providing Arm PL RPM and .deb packages which are compatible with NVHPC 24.1. Go to the Arm Performance Libraries downloads page to access the standalone versions of Arm PL for Linux, as well as Windows and macOS.

Get Started guides for all platforms

With Arm PL now available across multiple platforms, we provide a separate "Getting Started" guide for each to explain the basics. These are short guides available on developer.arm.com as either web pages or PDFs. We recommend downloading the PDF versions of the files for reference:

  • Arm PL in Arm Compiler for Linux
  • Standalone Arm Performance Libraries for Linux
  • Arm Performance Libraries for macOS
  • Arm Performance Libraries for Windows

Users are also referred to the Arm Performance Libraries Reference Guide for complete documentation of all of the functions provided in the libraries.
