Software, Tools, and Ecosystem for ML Edge Devices

Reinhard Keil
July 17, 2024
8 minute read time.

Let’s talk about how Arm enables developers and the IoT software ecosystem to deliver smart, energy-efficient ML edge devices. The IoT is steadily growing, and new innovations in AI technology are mind-blowing. In the IoT line of business, we are working to scale AI innovations to tiny, constrained ML edge devices that are frequently powered by microcontrollers.

Generative AI and large language models have captured the public attention. We all have fun experimenting with an AI chat bot, but it's both brilliant and equally flawed. One thing is obvious: an AI chat bot that just lives within the chat window is limited. It's cut off from the real world.

By connecting AI to the real world, it gets interesting. Merging the technology with IoT endpoints creates synergy and new possibilities. It's clear to me that AI at the edge and IoT are going to come together to create the platforms of the future.

The IoT can provide real-time data, and it can connect to the real world. It's the eyes, the ears, the touch and all the senses, and it's the muscles. But ML edge devices do more than just sense and actuate. They take localized control and regulate critical functions. In human biology, this is the role of the autonomic nervous system, whose autonomous control keeps things alive even when cloud AI is taking a nap. The combination of cloud AI and ML edge devices will revolutionize many services and industries. And, to cope with real-time, power, and data constraints, several of the smart tasks will be handled locally on ML edge devices.

Cortex-M and Ethos-U Processors for ML Edge Devices

Arm offers a broad range of optimized processors targeting ML applications on edge devices. Even the smallest Arm processor, the Cortex-M0/M0+, executes simple ML algorithms. Starting with Cortex-M4, the processors add hardware floating point arithmetic and SIMD instructions to accelerate DSP and ML algorithms.

The Helium vector extension on Cortex-M52, Cortex-M55, and Cortex-M85 boosts these algorithms further and enables applications such as speech keyword spotting or object and anomaly detection. The Ethos-U series of Neural Processing Units (NPUs) acts as a turbo co-processor for even more demanding applications such as smart cameras with real-time object classification.

Arm Processor Portfolio for ML Edge Devices

The diagram above shows typical use cases, but selecting the right processor for an application can be a challenge. Fortunately, most ML models can be deployed to a variety of Arm processors, so system architects can initially focus on the software workloads. Still, it is important to understand the memory and compute requirements on different platforms. Here the EEMBC/SPEC AudioMark benchmark is helpful. It implements a typical audio pipeline with keyword spotting of the kind you find in smart speakers. While individual ML algorithms operate significantly faster with Helium or Ethos-U, AudioMark lets you compare processors at the level of a complete application.

EEMBC AudioMark Benchmark Results
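
Before running a full benchmark, a rough first estimate of memory and compute requirements can be made from the model structure alone. The following is a minimal sketch, assuming a small classifier built only from fully connected layers with int8 weights and int32 biases. The layer sizes, the inference rate, and the names `DenseLayer` and `estimate` are hypothetical and chosen only for illustration; they are not taken from AudioMark or from any Arm tool.

```python
from dataclasses import dataclass


@dataclass
class DenseLayer:
    """Fully connected layer; int8 weights and int32 biases are assumed."""
    inputs: int
    outputs: int

    def macs(self) -> int:
        # One multiply-accumulate per weight for each inference.
        return self.inputs * self.outputs

    def weight_bytes(self) -> int:
        # 1 byte per int8 weight plus 4 bytes per int32 bias.
        return self.inputs * self.outputs + 4 * self.outputs


def estimate(layers, inferences_per_second):
    """Sum compute and weight storage over all layers."""
    total_macs = sum(layer.macs() for layer in layers)
    total_bytes = sum(layer.weight_bytes() for layer in layers)
    return {
        "macs_per_inference": total_macs,
        "macs_per_second": total_macs * inferences_per_second,
        "weight_bytes": total_bytes,
    }


# Hypothetical keyword-spotting classifier: 490 input features, 12 output classes.
model = [DenseLayer(490, 144), DenseLayer(144, 144), DenseLayer(144, 12)]
report = estimate(model, inferences_per_second=20)
print(report)
```

Such an estimate ignores activation buffers, the DSP front-end, and the rest of the application, which is why an application-level benchmark remains the better basis for a final processor choice.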

Developer support

Developing a complete ML edge device is a multi-year endeavor. To accelerate this process, code reuse and early validation are key. Corstone IoT sub-systems help during SoC design with the right architecture choice, integration, and verification. The various Corstone systems are available as FPGA images and Arm Virtual Hardware simulation models, and support both hardware architects and software developers during the whole design phase of an ML edge device.

The Corstone-315 (shown below) integrates Cortex-M85 along with an optional Ethos-U65 NPU and Arm Mali-C55 image signal processor (ISP) to build low-power, low-cost, high-performance secure endpoint AI devices that support convolutional neural networks (CNNs). CNNs are powerful artificial neural networks that are well-suited for image recognition, for example in smart cameras.

Corstone-315 IoT Sub-System

To provide you with the best experience for developing ML applications, Arm offers solutions that cover hardware components, tools, and software to make product development easy and productive. The ML Developers Guide for Cortex-M Processors and Ethos-U NPU gives you an overview of the ML development process. It introduces you to the Arm technology and products that support ML development workflows, from ML model training through to debugging on hardware.

Microcontrollers that are based on Cortex-M55, Cortex-M85, and Ethos-U are now hitting the mass market, and low-cost evaluation boards let you explore this modern technology. A few examples of such evaluation boards are:

  • Alif evaluation kits based on Ensemble E7 with dual Cortex-M55, Ethos-U55, and Cortex-A32.
  • Seeed Studio Grove Vision AI based on Himax WiseEye2 with dual Cortex-M55 and Ethos-U55.
  • Renesas EK-RA8M1 based on the RA8M1 microcontroller with Cortex-M85.

Standardized Software and Tools

The embedded market is fragmented because it addresses a diverse range of application-specific use cases. However, many embedded systems contain similar building blocks, and standardizing these commonalities enables code reuse across many systems and simplifies product lifecycle management. Arm therefore invests in software standardization, development tools, and ecosystem partnerships.

Development tools:

  • Arm Compiler for Embedded: commercial compiler with Helium auto-vectorization.
  • LLVM Embedded Toolchain for Arm: open-source compiler with Helium auto-vectorization.
  • Vela: compiles a TensorFlow Lite for Microcontrollers neural network model for Ethos-U.
  • ML Inference Advisor (MLIA): helps ML model designers to optimize for Ethos-U.
  • Arm Virtual Hardware based on Fixed Virtual Platforms (AVH FVP): accurate simulation models for software validation.

Software building blocks:

  • CMSIS: set of standardized software components, APIs, software frameworks, and foundational tools.
  • Platform Security Architecture (PSA): IoT Security Framework with reference implementations.

Tool suites:

  • Foundation Components for MLOps Systems: tools and software components for the overall development flow for machine learning applications.
  • Keil MDK Microcontroller Development Kit: comprehensive toolset for the development of IoT and ML applications on Cortex-M.

Many ecosystem partners such as ST, NXP, and IAR are utilizing CMSIS, PSA and other components in their development tools. ML Frameworks such as TensorFlow are validated for the Arm processor portfolio using Arm Compiler and AVH. And MLOps partners are now integrating the Arm foundation components into their MLOps systems.

Software development

The software and system design of an ML edge device can be separated into two parts:

  • The classic embedded IoT software, which requires efficient device drivers that interface with I/O peripherals, a communication stack with security, and firmware update services.
  • The ML part, which implements the machine learning algorithm. It is frequently designed with an MLOps (Machine Learning Operations) system, often a software-as-a-service (SaaS) cloud environment that is specialized for ML algorithm development.

Sensor, audio, or video inputs are typically converted into serial data streams for processing. Most ML applications process these data streams in several steps (see picture below). The signal conditioning and feature extraction are implemented using a resource-optimized DSP front-end, and the ML model takes this optimized data stream as its input. Such an optimized ML processing pipeline is device agnostic and does not require the exact physical target. Software development therefore frequently uses simulation models or superset boards that offer more resources during test and validation. Once the implementation is tested on the Arm processor, it is relatively simple to adapt the validated ML processing pipeline to different target hardware.

ML Processing Pipeline
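
The steps of such a pipeline can be sketched on a host machine before any target hardware is involved. The following is a minimal sketch, assuming a one-dimensional sample stream, DC removal as the signal conditioning step, RMS energy and zero-crossing rate as the extracted features, and a fixed threshold standing in for the trained ML model. All function names, frame sizes, and the threshold are hypothetical; a real pipeline would use an optimized DSP library and a trained network instead.

```python
import math


def remove_dc(samples):
    """Signal conditioning: subtract the mean so the signal is centered on zero."""
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]


def frames(samples, size, hop):
    """Split the stream into overlapping frames of a fixed size."""
    for start in range(0, len(samples) - size + 1, hop):
        yield samples[start:start + size]


def features(frame):
    """Feature extraction: RMS energy and zero-crossing rate of one frame."""
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return rms, crossings / (len(frame) - 1)


def classify(feature_vector, rms_threshold=0.1):
    """Stand-in for the ML model: a fixed threshold, not a trained network."""
    rms, _zcr = feature_vector
    return "active" if rms >= rms_threshold else "idle"


def run_pipeline(samples, size=64, hop=32):
    """Conditioning, then framing, then features, then classification."""
    conditioned = remove_dc(samples)
    return [classify(features(f)) for f in frames(conditioned, size, hop)]


# Synthetic input with a DC offset of 0.5: silence followed by a sine tone.
silence = [0.5] * 128
tone = [0.5 + 0.8 * math.sin(2 * math.pi * 8 * n / 128) for n in range(128)]
labels = run_pipeline(silence + tone)
print(labels)
```

Because every stage only consumes and produces plain data, the same structure can be exercised with recorded input on a simulation model and later moved to the physical target.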

During development, both DSP algorithm design and ML training require real-world data that is collected with the sensors of the edge device. This data collection is supported by the Synchronous Data Stream (SDS) Framework, which allows real-world data to be recorded and played back to AVH FVP during validation. CMSIS-Stream helps developers to design and optimize processing pipelines with multiple DSP algorithms. Combined, SDS and CMSIS-Stream are effective tools that support you during the development cycle. With AVH FVP you can analyze the correctness and performance of each step in an ML processing pipeline.

  • Watch this webinar recording to learn more about the usage of SDS and CMSIS-Stream during the development process.
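
The record-and-playback idea can be illustrated with a small host-side sketch. It assumes a hypothetical record layout of a 32-bit timestamp, a 32-bit payload size, and the raw payload bytes. This layout and the `record` and `playback` helpers are assumptions made for this example only; they do not reproduce the SDS Framework API or its file format.

```python
import io
import struct

# Hypothetical record layout (an assumption for this sketch):
# little-endian uint32 timestamp in ms, uint32 payload size, then the payload.
HEADER = struct.Struct("<II")


def record(stream, timestamp_ms, payload):
    """Append one timestamped block of sensor data to the stream."""
    stream.write(HEADER.pack(timestamp_ms, len(payload)))
    stream.write(payload)


def playback(stream):
    """Yield (timestamp_ms, payload) tuples in the order they were recorded."""
    while True:
        header = stream.read(HEADER.size)
        if len(header) < HEADER.size:
            return  # end of recording
        timestamp_ms, size = HEADER.unpack(header)
        payload = stream.read(size)
        if len(payload) < size:
            return  # truncated final record, stop here
        yield timestamp_ms, payload


# Record three blocks of three int16 samples each into an in-memory file.
capture = io.BytesIO()
for i in range(3):
    samples = struct.pack("<3h", 10 * i, 10 * i + 1, 10 * i + 2)
    record(capture, 100 * i, samples)

# Play the recording back, as a test harness would feed it to the pipeline.
capture.seek(0)
replayed = [(t, struct.unpack("<3h", p)) for t, p in playback(capture)]
print(replayed)
```

Keeping the timestamp with each block lets a test harness replay the data with the original timing, so the same captured input can be fed repeatedly into a pipeline under test.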

MLOps partners

The development of the ML classification or ML model itself is a complex task that is typically performed by domain experts and data analysts. Fortunately, the market offers many powerful ML models that target Arm ML edge devices. Arm works with several AI ecosystem partners to optimize ML models for a variety of typical applications. Below are a few examples:

  • Baidu PaddlePaddle is a partner in China that works with Arm on an optimized MLOps platform.
  • Edge Impulse offers a training platform and works on ML model optimization for edge devices.
  • Plumerai delivers pre-trained ML models for people detection and familiar face detection that are optimized for Arm processors and Ethos-U NPUs.
  • TDK Qeexo AutoML is an ML model and training platform for sensor applications on microcontrollers with an intuitive workflow (see below).

References

  • Edge Impulse Keil MDK integration
  • Creating & Deploying AI Condition-Based Monitoring Solutions with Qeexo and Arm
  • Arm Community Blog: CMSIS v6 is here
  • Arm Community Blog: MDK v6 released
  • Arm Learning paths for microcontrollers and Ethos-U

Get the ML Developers Guide for Cortex-M Processors and Ethos-U NPU:

Get the guide
