With the announcement of Arm neural technology, Arm is enabling neural networks and a new class of neural graphics capabilities to run efficiently on mobile GPUs. Neural Super Sampling (NSS), denoising, and machine learning (ML)-powered rendering enhancements are just the start.
Today, we are excited to share that ExecuTorch, with its new General Availability (GA) release, now includes support for Arm neural technology through the VGF backend. This backend provides a complete ahead-of-time (AOT) export and runtime execution path for a large set of neural networks targeting Arm's next-generation neural GPU acceleration. It also enables export for direct use in game engines.
This builds on the foundations Arm has laid over the past few years. A key enabler is TOSA (Tensor Operator Set Architecture), which standardizes ML operators for acceleration on Arm platforms. Thanks to TOSA, ExecuTorch could already target Arm Ethos-U-based devices, and that same infrastructure now extends to future neural-technology-capable hardware, providing:
- Consistent behavior and high performance across Arm accelerator technology, be that Ethos-U or the neural technology in future Arm GPUs, thanks to the TOSA standard.
- A suite of open-source software for working with TOSA (including compilers, torch.fx passes, and an MLIR dialect) that interoperates cleanly with both PyTorch and ExecuTorch.
This continues the story we told in two earlier blog posts: "ExecuTorch and TOSA" and "ExecuTorch support for Ethos-U85".
What makes this new support possible is the VGF backend. It introduces an ahead-of-time compilation flow and a runtime integration that bridge the gap between PyTorch models and efficient deployment on neural-technology-capable hardware. The backend provides tooling to export models as portable files, load them through the ExecuTorch runtime, and execute them on a VGF emulator. This makes it possible to develop networks on a standard ML development platform today and target future Arm GPUs when they arrive.
To run this example, first install ExecuTorch and set up the Arm backend dependencies:
```bash
pip install executorch
./examples/arm/setup.sh --i-agree-to-the-contained-eula --disable-ethos-u-deps --enable-mlsdk-deps
```
A short Python program then produces the exported model as a PTE file.
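The snippet below is a minimal sketch of that export flow. The `VgfCompileSpec` and `VgfPartitioner` names (and their `executorch.backends.arm.vgf` module path) are assumptions based on the Arm backend in the GA release; the capture and lowering calls (`torch.export.export`, `to_edge_transform_and_lower`, `to_executorch`) are standard ExecuTorch APIs. Check the example notebooks for the exact names in your release.

```python
import torch

# NOTE: the `vgf` module path and class names are assumptions based on the
# Arm backend's layout in the GA release; consult the ExecuTorch example
# notebooks for the exact API in your version.
from executorch.backends.arm.vgf import VgfCompileSpec, VgfPartitioner
from executorch.exir import to_edge_transform_and_lower


class AddModule(torch.nn.Module):
    """Trivial network matching the "add" example used elsewhere on this page."""

    def forward(self, x, y):
        return x + y


example_inputs = (torch.ones(1, 4), torch.ones(1, 4))

# 1. Capture the model with torch.export.
exported_program = torch.export.export(AddModule().eval(), example_inputs)

# 2. Lower to the Edge dialect, delegating supported subgraphs to the VGF backend.
edge_program = to_edge_transform_and_lower(
    exported_program,
    partitioner=[VgfPartitioner(VgfCompileSpec())],
)

# 3. Serialize to a PTE file for the ExecuTorch runtime.
executorch_program = edge_program.to_executorch()
with open("add_module_vgf.pte", "wb") as f:
    f.write(executorch_program.buffer)
```

The resulting add_module_vgf.pte is the file passed to the executor_runner below.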
You can also explore the example models in the ExecuTorch examples/models tree:
```bash
python3 -m examples.arm.aot_arm_compiler -t vgf --delegate --model_name="add" -i ./out_add -o out_add.pte
```
A PTE file produced by either flow can then be executed by building and using the example executor_runner:
```bash
# Set up the target build environment (host Linux with the MLSDK emulator)
./setup.sh --disable-ethos-u-deps --enable-mlsdk-deps
source examples/arm/ethos-u-scratch/setup_path.sh

# Build the ExecuTorch runtime
cmake --preset linux -DEXECUTORCH_BUILD_VULKAN=ON -DEXECUTORCH_BUILD_VGF=ON \
  -DCMAKE_INSTALL_PREFIX=cmake-vgf -Bcmake-vgf
cmake --build cmake-vgf -j$(nproc) --target executor_runner

# Run the produced PTE file with the runtime example application
./cmake-vgf/executor_runner -model_path add_module_vgf.pte
```
For further details and many additional options to tailor the flow to your requirements, take a look at our ExecuTorch example notebooks.
Better still, these networks can be used directly in game engines for use cases such as NSS.
We invite developers to try out the VGF backend today. By doing so, you will be ready to target Arm neural technology as it arrives in upcoming generations of Arm GPUs.
To help you get started, here are some resources: