Arm Community – AI and ML blog

Arm Ethos-N ML Inference Processors: Powering Exciting User Experiences on Edge Devices

Ian Forsyth
May 27, 2019
5 minute read time.

OK. Quick survey: How many connected devices do you own?

Whether you’re a gadget addict or just an average Josephine, I’m not sticking my neck out too far if I guess that you own more today than you did five years ago. From smartphones and tablets to personal fitness trackers, smart asthma inhalers and smart doorbells, we’re all busily increasing our connectivity year on year – along with our own personal data explosion. According to a recent report, the global number of connected devices per capita has leapt from fewer than two a decade ago to a projected 6.58 by 2020. That’s an awful lot of devices creating an awful lot of data.

Until recently, that data was routinely shipped to the cloud for processing. But as the volume of data and the number of devices grow exponentially, it’s just not practical – not to mention secure or cost-effective – to keep shifting all that data back and forth.

Fortunately, recent advances in machine learning (ML) mean that more processing, and pre-processing, can now be done on-device than ever before. This brings a range of benefits, from increased safety and security, thanks to the reduced risk of data exposure, to cost and power savings. Infrastructure to transmit data to the cloud and back doesn’t come cheap, so the more processing that can be done on-device, the better.

Power and Efficiency Across the Performance Curve

On-device ML starts with the CPU, which acts as an adept ‘traffic controller’, either single-handedly managing entire ML workloads or distributing selected tasks to a specialized processor such as an Ethos-N NPU.

Arm CPUs – and GPUs – are already powering thousands of ML use cases across the performance curve, not least for mobile, where edge ML is already driving features that consumers have come to expect as standard. (Bunny ear selfie, anyone?)

As these processors get ever-more powerful and efficient, they drive even higher performance, which enables more on-device compute power for secure ML at the edge. (See the launch of the third-generation DynamIQ ‘big’ core Arm Cortex-A77 CPU, for example, which can manage compute-intensive tasks without impacting battery life, and the Arm Mali-G77 GPU, which delivers a 60 percent performance improvement for ML.)

But while CPUs and GPUs are ML powerhouses in their own right, they can struggle to keep up where the most intensive, most efficient performance is required. For these tasks, the might of a dedicated neural processing unit (NPU), such as the Arm Ethos-N77, comes into its own, delivering the highest throughput and most efficient processing for ML inference at the edge.

NPU Drives New, Exciting User Experiences

So, what makes the Ethos-N77 so special? Well, it’s based on a brand-new architecture, targeting connected devices such as smartphones, smart cameras, augmented and virtual reality (AR/VR) devices and drones, as well as medical and consumer electronics. If you’re interested in how it stacks up numbers-wise, you can’t fail to be impressed by its outstanding performance of up to 4 TOP/s, enabling new use cases that were previously impossible due to limited battery life or thermal constraints. It allows developers to create new user experiences such as 3D face unlock or advanced portrait modes featuring depth control or portrait lighting.

Of course, superb performance is great – but not if it requires you to charge your device every couple of hours or drag a power bank with you wherever you go. To set users free from the tyranny of the charging cable, the ML processor boasts an industry-leading power efficiency of 5 TOPs/W – achieved through state-of-the-art optimizations such as weight and activation compression, as well as Winograd convolution.

Winograd enables 225% greater performance on key convolution filters compared to other NPUs, in a smaller footprint, driving efficient performance while reducing the number of components required in any given design. This in turn lowers cost and power requirements without compromising on user experience.
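To see where those savings come from, here is a minimal Python sketch of the textbook 1D Winograd F(2,3) transform, which computes two outputs of a three-tap filter with four multiplies instead of six (the 2D variant used for convolution layers tiles the same idea). This is an illustration of the technique only, not the Ethos-N’s actual implementation:

```python
def winograd_f23(d, g):
    """Winograd F(2,3): two outputs of a 3-tap filter in 4 multiplies.

    d: 4 input samples, g: 3 filter taps.
    A direct computation needs 2 x 3 = 6 multiplies.
    """
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter transform (precomputable once per filter, so it is "free"
    # at inference time).
    G0 = g0
    G1 = (g0 + g1 + g2) / 2
    G2 = (g0 - g1 + g2) / 2
    G3 = g2
    # The four multiplies.
    m0 = (d0 - d2) * G0
    m1 = (d1 + d2) * G1
    m2 = (d2 - d1) * G2
    m3 = (d1 - d3) * G3
    # Output transform: only additions and subtractions.
    return [m0 + m1 + m2, m1 - m2 - m3]


def direct_conv(d, g):
    """Reference: direct sliding-window computation, 6 multiplies."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]
```

Both functions produce identical results; the win is that a hardware multiplier array sized for the transformed domain does the same work with fewer multipliers, which is where the area and power savings originate.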

The architecture consists of fixed-function engines, for the efficient execution of convolution layers, and programmable layer engines, for executing non-convolution layers and implementing selected primitives and operators. These natively supported functions are closely aligned with common neural frameworks, reducing network deployment costs and allowing a faster time to market.
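As a conceptual sketch of this split, the toy partitioner below groups a network’s layers into consecutive segments, each targeting either a ‘fixed-function’ or a ‘programmable’ engine. The operator names and the partitioning policy are illustrative assumptions, not the behavior of the actual Ethos-N compiler or driver:

```python
# Hypothetical set of ops a fixed-function convolution engine runs natively.
FIXED_FUNCTION_OPS = {"conv2d", "depthwise_conv2d"}


def partition(layers):
    """Group consecutive layers into segments, one target engine per segment."""
    segments = []
    for op in layers:
        engine = "fixed-function" if op in FIXED_FUNCTION_OPS else "programmable"
        if segments and segments[-1][0] == engine:
            # Same engine as the previous layer: extend the current segment.
            segments[-1][1].append(op)
        else:
            # Engine change: start a new segment.
            segments.append((engine, [op]))
    return segments
```

For a toy network `["conv2d", "conv2d", "relu", "pool", "conv2d", "softmax"]`, the convolutions land on the fixed-function engines and the remaining operators fall through to the programmable layer engines, mirroring the division of labour described above.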

The Arm Ethos-N77 premium ML inference processor contains 16 compute engines
  • Efficiency: Provides a massive uplift from CPUs, GPUs, DSPs and accelerators of up to 5 TOPs/W
  • Network support: Processes a variety of popular neural networks, including convolutional (CNNs) and recurrent (RNNs), for classification, object detection, image enhancements, speech recognition and natural language understanding
  • Security: Executes with a minimal attack surface, built on the foundation of the Arm TrustZone architecture
  • Scalability: Scales, via multicore, up to eight NPUs and 32 TOP/s in a single cluster, or 64 NPUs in a mesh configuration
  • Neural framework support: Integrates closely with existing frameworks: TensorFlow, TensorFlow Lite, Caffe, Caffe2 and others via ONNX
  • Winograd convolution: Accelerates common filters by 225% compared to other NPUs, allowing more performance in less area
  • Memory compression: Minimizes system memory bandwidth through a variety of compression technologies
  • Heterogeneous ML compute: Optimized for use with Arm Cortex-A CPUs and Arm Mali GPUs
  • Enabled by open-source software: Supported by Arm NN to reduce cost and avoid lock-in
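Arm has not published the compression schemes themselves, but a toy zero run-length encoder shows why trained (often sparse) neural network weights compress well, cutting the memory bandwidth needed to fetch them. This is purely illustrative and makes no claim about the Ethos-N’s actual format:

```python
def rle_compress(weights):
    """Toy zero run-length encoding: a (zero_run_length, value) pair
    per nonzero weight. Runs of zeros, common in trained networks,
    cost nothing beyond the count in the next pair."""
    encoded, zeros = [], 0
    for w in weights:
        if w == 0:
            zeros += 1
        else:
            encoded.append((zeros, w))
            zeros = 0
    if zeros:
        encoded.append((zeros, 0))  # record trailing zeros
    return encoded


def rle_decompress(encoded):
    """Inverse transform: expand each pair back into zeros plus the value."""
    out = []
    for zeros, w in encoded:
        out.extend([0] * zeros)
        if w != 0:
            out.append(w)
    return out
```

For a sparse weight row such as `[0, 0, 3, 0, 0, 0, 5, 0]`, the encoded form holds just three pairs: the sparser the weights, the fewer values cross the memory interface.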

Futureproof and Versatile

To make life easy for developers, the Ethos-N77 has an integrated network control unit and DMA engine, which manage the overall execution and traversal of the network and move data in and out of main memory in the background.

Onboard memory allows central storage for weights and feature maps, reducing the traffic to external memory and so increasing battery life – another nod to the superlative user experience that consumers have come to expect as standard.

Crucially, the Ethos-N77 is flexible enough to support use cases with higher requirements, running an increased number and size of concurrent features: up to eight cores can be configured in a single cluster, achieving 32 TOP/s of performance, or up to 64 NPUs in a mesh configuration.

Ultimately, the Ethos-N77 boosts performance, drives efficiency, reduces network deployment costs and – through tight coupling of fixed-function and programmable engines – futureproofs the design, allowing firmware to be updated as new features are developed.

Through this combination of power, efficiency and flexibility, the Ethos-N77 is defining the future of ML inference at the edge, empowering developers to meet the requirements of tomorrow’s use cases whilst creating today’s optimal user experience.

Learn more about Ethos-N processors
