This blog post describes the steps to develop tiny machine learning (ML) software by using TensorFlow Lite for Microcontrollers (TFLM) on the Corstone-3xx platform. It assumes that you have a basic knowledge of ML software and embedded Cortex-M software programming.
This blog post assumes that you have the following tools available on your machine:
The Corstone-3xx portfolio enables SoC designers to quickly develop IoT and endpoint AI devices based on Armv8-M processors and the Arm Ethos-U NPU.
The following figure shows the Corstone-3xx portfolio, which contains a configurable example subsystem, Arm System IP, software, and tools:
This section describes how to quickly develop and evaluate TFLM inference software on the Corstone-310 platform. The same method extends easily to other Corstone-3xx platforms.
The general TFLM-based ML software development process on the Corstone-3xx platform is as follows:
To start developing machine learning software on the Corstone-310 platform, perform the following steps:
The Arm® Keil® Studio Pack is a collection of Visual Studio Code extensions. The pack provides a development environment for embedded systems and IoT software on Arm-based microcontroller (MCU) devices.
Follow the instructions described in Create a solution to create a new solution by using the CMSIS extension in Visual Studio Code.
For the Corstone-310 project, select SSE-310 as the target, as shown in the following figure. Then select an appropriate compiler toolchain and create the project.
The following figure shows the structure of the created project:
The TFLM software packs are available from the CMSIS-Pack web page. Keil Studio provides the Manage Software Components function to add the software packs to the project.
The required TFLM software packs are shown in the following figures:
The optimized model includes TensorFlow Lite custom operators for the parts of the model that can be accelerated by the Ethos-U NPU.
Add the Ethos-U driver software pack into the project as shown in the following figure. The Corstone-310 platform uses the Ethos-U55.
CMSIS-NN is a software library of neural network kernels that are optimized for various Arm Cortex-M processors. The model parts that cannot be accelerated by the NPU are left unchanged and run on the Cortex-M CPU by using CMSIS-NN kernels.
Add the CMSIS-NN and DSP software packs as shown in the following figure.
This blog post uses a pre-trained model built with the TensorFlow framework and trained on the CIFAR-10 dataset. The input to the model is a 32x32 pixel image with 3 color channels (RGB), which is classified into one of 10 output classes.
To deploy your Neural Network (NN) model on Ethos-U, perform the following steps:
1. Optimize the model by using Vela.
Vela is an open-source Python tool that optimizes a neural network model to run on an embedded system containing an Ethos-U NPU.
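For example, a possible Vela invocation targeting the 256-MAC Ethos-U55 configuration used by the Corstone-310 (the input file name is an assumption):
vela cifar10_Int8.tflite --accelerator-config ethos-u55-256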
For details about Vela, see Vela Compiler: The first step to deploy your NN model on the Arm Ethos-U microNPU.
2. Convert the resulting model into the firmware image, which must be represented as an array in C/C++ syntax.
You can use a hexdump utility, for example xxd, to convert the binary .tflite file into a C/C++ source file that you can add to your project (a sketch of the generated file follows these steps). For example:
xxd -i cifar10_Int8_vela_H256.tflite > cifar10_Int8_vela_H256.cc
3. Add the .cc file to the project.
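The generated source file defines a byte array and a length variable. The following is a sketch of what xxd emits, assuming the file name above; only the first few bytes are shown, and the const and alignas(16) qualifiers are common manual additions for TFLM rather than xxd output:
alignas(16) const unsigned char cifar10_Int8_vela_H256_tflite[] = {
  0x20, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4c, 0x33, /* ... remaining model bytes ... */
};
const unsigned int cifar10_Int8_vela_H256_tflite_len = sizeof(cifar10_Int8_vela_H256_tflite);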
The following steps show how the application code calls the TFLM inference APIs to run inference on the model.
1. Load the model
Application code calls tflite::GetModel(cifar10_Int8_tflite_vela) to load the model.
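A minimal sketch of this step, with a schema version check that is a common TFLM pattern rather than part of the original snippet:
const tflite::Model* model = tflite::GetModel(cifar10_Int8_tflite_vela);
if (model->version() != TFLITE_SCHEMA_VERSION) {
  // The model was generated with an unsupported schema version.
  MicroPrintf("Model schema version %d is not equal to supported version %d.",
              (int)model->version(), TFLITE_SCHEMA_VERSION);
}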
2. Set up the operation resolver
For the pre-trained CIFAR-10 model, the following snippet shows you how to set up an operation resolver for this model.
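The snippet assumes a resolver that has already been declared. In TFLM this is typically a MicroMutableOpResolver, whose template parameter gives the number of registered operators:
static tflite::MicroMutableOpResolver<7> op_resolver;  // 7 operators are added below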
op_resolver.AddConv2D();
op_resolver.AddMaxPool2D();
op_resolver.AddReshape();
op_resolver.AddFullyConnected();
op_resolver.AddSoftmax();
op_resolver.AddRelu();
op_resolver.AddEthosU();
The last call, AddEthosU, registers the TensorFlow Lite custom operator for the model parts that the Ethos-U NPU accelerates.
3. Create and initialize the MicroInterpreter
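The interpreter needs a tensor arena, a memory block that holds the input, output, and intermediate tensors. A minimal sketch of the declaration; the arena size is an assumption and must be tuned for your model:
constexpr size_t kTensorArenaSize = 128 * 1024;  // assumption: enlarge if AllocateTensors() fails
alignas(16) static uint8_t tensor_arena[kTensorArenaSize];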
tflite::MicroInterpreter interpreter(model, op_resolver, tensor_arena, kTensorArenaSize);
interpreter.AllocateTensors();
4. Set input data
Application code calls the interpreter.input(0) function to get the input tensor and sets up the proper input data for the model. The pre-trained model expects a 32x32 pixel image with 3 bytes per pixel (RGB).
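A minimal sketch, assuming an int8 quantized input tensor and a hypothetical buffer image_rgb that already holds the quantized 32x32x3 image:
TfLiteTensor* input = interpreter.input(0);
memcpy(input->data.int8, image_rgb, 32 * 32 * 3);  // requires <cstring>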
5. Invoke the model
Application code calls interpreter.Invoke() to run inference on the model.
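A minimal sketch with a status check; the error handling is an assumption:
TfLiteStatus invoke_status = interpreter.Invoke();
if (invoke_status != kTfLiteOk) {
  MicroPrintf("Invoke() failed");  // inference did not complete successfully
}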
6. Get the output
Application code calls interpreter.output(0) to get the inference output.
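A minimal sketch that reads the output tensor and picks the highest-scoring class; the variable names are assumptions:
TfLiteTensor* output = interpreter.output(0);
int best_class = 0;
for (int i = 1; i < 10; ++i) {
  if (output->data.int8[i] > output->data.int8[best_class]) {
    best_class = i;  // index of the most likely class
  }
}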
After the project builds successfully, perform the following steps:
1. Start a debug session in Keil Studio
Follow the instructions described in Start a debug session to set up a debug session for the project.
The following figure shows the debug session settings for the Corstone-310 FVP.
2. Run the image in FVP
The pre-trained CIFAR-10 model is used for image classification. It classifies images into 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.
The output from this model is a 1D tensor of size 10. Each element corresponds to the predicted probability that the input image belongs to one of the 10 classes.
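Because the model is int8 quantized, the raw output values must be dequantized to recover the probabilities. A minimal sketch, using the quantization parameters stored in the output tensor obtained earlier from interpreter.output(0):
float probability = output->params.scale *
                    (output->data.int8[i] - output->params.zero_point);  // for class index i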
The following figure shows the output text from inference on the CIFAR-10 model in the Corstone-310 FVP. The model correctly classifies the input image as a truck.
The Corstone-3xx series and its related tools and ecosystem help you rapidly develop endpoint AI devices and deploy ML software on them.
This section lists relevant Arm publications for your reference: