Arm ML Processor: Powering Machine Learning at the Edge

Ian Forsyth
February 13, 2018
3 minute read time.

It would be really amazing to have a personal assistant in my hands that is actually smart, truly understands my words and responds intelligently to resolve day-to-day tasks. The recent advancements in Machine Learning (ML) make me optimistic that such a day is not too far away. ML has quickly moved from identifying cat pictures to solving real-world problems well beyond the mobile market, in areas such as healthcare, retail, automotive and servers.

Now, the major challenge is to move this power to the edge and solve the privacy, security, bandwidth and latency issues that exist today. The Arm ML processor is a huge step in this direction.

Mobile Performance

The ML processor is a brand-new design for mobile and adjacent markets – such as smart cameras, AR/VR, drones, medical and consumer electronics – delivering 4.6 TOPs of performance at an efficiency of 3 TOPs/W. Further compute and memory optimizations lead to significant additional performance gains across different networks.
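
As a quick back-of-the-envelope check on those headline figures, 4.6 TOPs at 3 TOPs/W implies a sustained power budget of roughly 1.5 W, the kind of number a mobile SoC can accommodate. The tiny Python snippet below just makes that arithmetic explicit.

    # Implied power budget from the headline figures quoted above:
    # sustaining 4.6 TOPs at an efficiency of 3 TOPs/W needs ~1.5 W.
    peak_tops = 4.6
    efficiency_tops_per_watt = 3.0
    print(f"~{peak_tops / efficiency_tops_per_watt:.2f} W")  # ~1.53 W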

The architecture consists of fixed-function engines, which execute convolution layers, and programmable layer engines, which execute non-convolution layers and implement selected primitives and operators. The network control unit manages the overall execution and traversal of the network, while the DMA engine moves data in and out of main memory. On-board memory provides central storage for weights and feature maps, reducing traffic to external memory and, therefore, power.

Figure: Arm ML processor
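
To make the division of labour between the two engine types more concrete, here is a purely illustrative Python sketch of how a network's layers might be partitioned; it is not Arm's scheduler or any real API, and every name in it is hypothetical. Convolution layers are routed to the fixed-function engines, and everything else to the programmable layer engines.

    # Illustrative sketch only (not Arm's scheduler or API): partition a
    # network's layers between fixed-function convolution engines and
    # programmable layer engines, as described above.
    CONV_OPS = {"conv2d", "depthwise_conv2d"}

    def partition_layers(layers):
        """Split an ordered list of layer descriptors into two execution queues."""
        fixed_function, programmable = [], []
        for layer in layers:
            if layer["op"] in CONV_OPS:
                fixed_function.append(layer)   # convolutions -> fixed-function engines
            else:
                programmable.append(layer)     # activations, pooling, etc. -> programmable engines
        return fixed_function, programmable

    # Example: a tiny MobileNet-style fragment
    network = [
        {"op": "conv2d", "name": "conv1"},
        {"op": "relu", "name": "relu1"},
        {"op": "depthwise_conv2d", "name": "dwconv1"},
        {"op": "avg_pool", "name": "pool1"},
    ]
    ff, pl = partition_layers(network)
    print([l["name"] for l in ff])  # ['conv1', 'dwconv1']
    print([l["name"] for l in pl])  # ['relu1', 'pool1']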

Thanks to the combination of fixed-function and programmable engines, the ML processor is extremely powerful, incredibly efficient and flexible enough to adapt to your future challenges, providing raw performance along with the versatility to execute different neural networks effectively.

Key Features

  • Massive efficiency uplift over CPUs, GPUs, DSPs and accelerators
  • Enabled by open-source software, so there’s no lock-in
  • Closely integrated with existing software frameworks: TensorFlow, TensorFlow Lite, Caffe, Caffe 2
  • Optimized for use with Arm Cortex CPUs and Arm Mali GPUs

Arm ML architecture: flexible, scalable, future-proof

Figure: Trillium flexible, scalable architecture

To tackle the challenges of multiple markets, with a wide range of performance requirements – from a few GOPs for IoT to tens of TOPs for servers – the ML processor is based on a new, scalable architecture.

The architecture can be scaled down to approximately 2 GOPs of performance for IoT or embedded-level applications, or scaled up to 150 TOPs for ADAS, 5G, or server-type applications. These configurations can achieve many times the efficiency of existing solutions.

The architecture is compatible with existing Arm CPUs, GPUs and other IP, providing a complete, heterogeneous system, and will also be accessible through popular ML frameworks such as TensorFlow, TensorFlow Lite, Caffe and Caffe 2.
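
To give a sense of what that framework-level access looks like from a developer's point of view, the snippet below runs a converted model with the standard TensorFlow Lite Python interpreter. This is ordinary TensorFlow Lite code rather than anything Arm-specific, and "model.tflite" is a placeholder; on Arm platforms, software such as Arm NN and the Compute Library sits underneath frameworks like this.

    # Minimal TensorFlow Lite inference loop; "model.tflite" is a placeholder
    # for any network converted to the TensorFlow Lite format.
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path="model.tflite")
    interpreter.allocate_tensors()

    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    # Feed a dummy input with the shape and dtype the model expects.
    dummy = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
    interpreter.set_tensor(input_details[0]["index"], dummy)

    interpreter.invoke()
    result = interpreter.get_tensor(output_details[0]["index"])
    print(result.shape)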

As more and more workloads move to ML, compute requirements will take on a wide variety of forms. Many ML use cases already run on Arm, with our enhanced CPUs and GPUs providing a range of performance and efficiency levels. With the introduction of the Arm Machine Learning platform, we aim to extend that range, providing a heterogeneous environment with the choice and flexibility required to meet each and every use case, enabling intelligent systems at the edge… and perhaps even the personal assistant I dream of.

Useful links

  • Arm NN SDK
  • Machine Learning on Arm - Frameworks Supporting Arm IP
  • Arm ML Processor
  • Compute Library
  • Cortex Microcontroller Software Interface Standard (CMSIS)
