Arm Ethos-N78 NPU: Unprecedented Machine Learning Capability at your Fingertips

May 26, 2020

3 minute read time.

Our everyday lives generate huge amounts of data and information – digital, biological, physical, and sensorial. With advances in AI, this data can be used to create incredible benefits for humankind. To meet this challenge and extract useful information, we need to be able to process this data when it is generated, where it is generated. At Arm we are on a mission to enable machine learning (ML) on-device, allowing data to be processed, analyzed, and utilized in the real world. This provides several advantages for consumers – from enhanced security and privacy to greater reliability and responsiveness.

With Arm, ML workloads will run on the Arm Cortex-A CPU – the world’s foremost ML processor, deployed in almost every smartphone in the world and in a vast array of other devices. However, ML can benefit from a specialized Neural Processing Unit (NPU), which delivers an exponential improvement in performance and efficiency. An NPU’s processing capability enables the development of exciting new applications that deliver true digital immersion. Examples include a shopping application that places virtual objects in a physical space, or smart home hubs with augmented reality (AR) story times for kids. Beyond smartphones, NPUs can also enable a wide range of other devices, such as life-saving smart baby cameras that monitor a baby’s breathing or temperature.

Presenting the Ethos-N78 Neural Processing Unit (NPU)

The Arm Ethos-N78 NPU is Arm’s highly scalable and efficient second-generation NPU. It delivers ML on-device and builds on the success of the Arm Ethos-N77 NPU. The Ethos-N78 NPU is available in performance points ranging from 1 TOP/s to 10 TOP/s and offers a wide range of configuration options.

Supporting over 90 unique configurations and allowing partners to configure the MACs, SRAM size and vector capability, Ethos-N78 provides unprecedented flexibility to our silicon partners. This flexibility ensures partners can fine-tune their design to reach the optimum balance of performance, power, and area. In addition, the Ethos-N78 can be implemented across a wide range of devices with complete and transparent software compatibility and portability.
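To make the link between MAC count and the quoted TOP/s range concrete, here is a minimal sketch of the usual peak-throughput arithmetic, in which each multiply-accumulate counts as two operations. The MAC counts and the 1 GHz clock below are hypothetical values chosen for illustration, not published Ethos-N78 configurations, and the function name peak_tops is our own.

```python
def peak_tops(mac_units: int, clock_hz: float) -> float:
    """Peak throughput in TOP/s.

    Each MAC unit performs one multiply and one add per cycle,
    which is conventionally counted as two operations.
    """
    return 2 * mac_units * clock_hz / 1e12


# Hypothetical design points (assumptions, not Arm-published figures).
ASSUMED_CLOCK_HZ = 1.0e9  # assumed 1 GHz NPU clock

for macs in (512, 1024, 2048, 4096):
    tops = peak_tops(macs, ASSUMED_CLOCK_HZ)
    print(f"{macs:>5} MACs at 1 GHz: {tops:.2f} TOP/s peak")
```

Peak figures like these are an upper bound; sustained throughput on a real network also depends on SRAM size, DRAM bandwidth, and how well the network maps onto the hardware, which is why those parameters are configurable too.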

Ethos-N78: unprecedented flexibility for SoC architects.

Doing more with less

The Ethos-N78 provides up to 30% better area efficiency than the previous generation, allowing partners to achieve more in less silicon area. While silicon area is an important cost metric, DRAM bandwidth is an equally precious resource in electronic systems. The Ethos-N78 has been specifically designed to use less DRAM bandwidth, moving up to 40% less data to and from DRAM per inference. This allows our partners to implement ML with lower memory demands, further reducing system power and cost. The Ethos-N78 allows extensive use of ML in software applications while still ensuring long battery life.
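As a rough illustration of what a per-inference DRAM saving means at system level, the sketch below converts DRAM traffic per inference into sustained bandwidth at a given inference rate. The 50 MB baseline and the 30 inferences per second are hypothetical numbers picked for the example; only the 40% figure comes from the text above, and it is a best case ("up to").

```python
def dram_bandwidth_gb_per_s(bytes_per_inference: float,
                            inferences_per_second: float) -> float:
    """Sustained DRAM bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return bytes_per_inference * inferences_per_second / 1e9


# Hypothetical workload (assumptions for illustration only).
BASELINE_BYTES_PER_INFERENCE = 50e6  # assumed 50 MB of DRAM traffic
INFERENCES_PER_SECOND = 30           # assumed 30 inferences per second
BEST_CASE_REDUCTION = 0.40           # "up to 40% less" from the text

baseline = dram_bandwidth_gb_per_s(
    BASELINE_BYTES_PER_INFERENCE, INFERENCES_PER_SECOND)
reduced = dram_bandwidth_gb_per_s(
    BASELINE_BYTES_PER_INFERENCE * (1 - BEST_CASE_REDUCTION),
    INFERENCES_PER_SECOND)

print(f"baseline: {baseline:.2f} GB/s")
print(f"reduced:  {reduced:.2f} GB/s")
print(f"saved:    {baseline - reduced:.2f} GB/s")
```

Because DRAM accesses cost far more energy than on-chip SRAM accesses, bandwidth freed this way translates into lower system power as well as headroom for other components sharing the same memory.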

Ethos-N78: advances in performance and efficiency.

Unified software and tools

Performant hardware is one part of the ML equation. It is equally important to have an efficient software stack so that developers can deploy their chosen ML networks on the target hardware. The Ethos-N78 software stack provides a choice of two flows: an offline compilation flow based on the TVM compiler, and an interpreted (on-device, or online) flow based on Arm NN for use with the Android NN API. Both flows are unified across all target Arm hardware IP (CPU, GPU, NPU), enabling a strategy of write once and deploy everywhere. The stack supports popular frameworks including TensorFlow, TensorFlow Lite, PyTorch and ONNX, among others, so developers can continue working with their favorite framework.

The Ethos-N NPUs are supported by the Ethos-N Static Performance Analyzer tool, which lets developers profile and tune their networks for the Ethos-N NPU well before silicon is available, significantly reducing time-to-market.

Unified software stack for developers

Delivering Exciting Experiences

The diverse range of applications and devices using ML today requires NPUs to be highly flexible and adaptable in order to satisfy a wide variety of requirements. On-device ML has come a long way in the last few years. Initially, it was adopted in mobile phones for functions such as face unlock and voice user interfaces. Today, however, we find ML usage expanding to cover innovative new uses – from delivering stunning photographs to cool AR-based applications. Beyond mobile, we find ML being used extensively in applications including HD security cameras, smart home hubs, and digital TVs (DTV) to deliver new features and user experiences. The Ethos-N78 NPU, with its unmatched flexibility and its advances in performance and power efficiency, enables our partners to unleash the potential of ML on-device.

