AI data-processing workloads at the edge are transforming use cases and user experiences. Arm’s third-generation Ethos-U85 NPU helps meet the needs of future edge AI use cases. Ethos-U85 is the highest-performing Ethos NPU, addressing the growing demand for running advanced AI inference workloads at the edge, including transformer-based networks such as large language models (LLMs).
Arm also offers reference designs. For example, the Corstone-320 IoT reference design platform integrates Ethos-U85, among other components, to accelerate and simplify the chip development cycle. The reference design platform also includes a Fixed Virtual Platform (FVP), which simulates the entire system and enables cutting-edge embedded software development and neural network deployment for Ethos-U85.
The example code in this technical blog post is tested on the Corstone-320 FVP. For more information and insights about Ethos-U85, the Corstone-320 reference design platform, and Arm FVPs, please visit Arm.com or developer.arm.com.
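If you would like to experiment with the FVP directly, outside of the helper scripts used later in this post, launching a bare-metal image is roughly as sketched below. This is an illustrative example: the application path is hypothetical, and you can list the model's full set of options with the --help and --list-params flags.

# Launch the Corstone-320 FVP with a bare-metal application image (path is illustrative)
FVP_Corstone_SSE-320 -a ./build/my_app.axf --stat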
What do you get when you combine China’s premier open-source deep learning platform with Arm? Rocket fuel for innovation.
Arm has a long-standing partnership with Baidu. Together we accelerate the development of transformative edge AI solutions such as PaddlePaddle on embedded devices.
Through the partnership, Arm has worked with Baidu to deploy nine classical PaddleLite vision models on the Ethos-U85 NPU.
To date, the list of supported models includes:
The Arm-Examples GitHub repository provides a full development environment with six example use cases. In this blog post, we show one example workflow deploying the “ch_ppocr_mobile_v2.0_rec” model (for an OCR use case) on the Ethos-U85 NPU. We also note considerations for deploying other common models. For detailed technical guidance, please see the deployment guide for each model in the repository.
Before you begin, please ensure that your environment meets the following requirements:
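The full requirements for each model are listed in its deployment guide. As a quick sanity check, a minimal sketch based only on the tools this walkthrough uses (Python 3.9, Git, and Git LFS) is shown below.

# Confirm the tools used in the following steps are installed
python3.9 --version   # Python 3.9 is used to create the virtual environment
git --version         # Git is needed to clone the example repository
git lfs version       # Git LFS is needed to pull the large model and test assets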
Create a Python virtual environment for model training or deployment. Please note that some other models in the repository may require different training and deployment virtual environments because of their fine-tuning processes. For more details, please refer to the deployment guide for each model in the repository.
# Create virtual environment with Python 3.9
python3.9 -m venv ppocr_rec
source ppocr_rec/bin/activate
cd ppocr_rec
Download the example code from GitHub and install the required packages.
# Download example source code
git clone https://github.com/Arm-Examples/Paddle-on-Ethos-U.git
cd Paddle-on-Ethos-U
git lfs pull

# Configure inference environment
bash install.sh
Download the PaddleLite model.
# Download ppocr_rec model
wget -O ./model_zoo/PpocrRec_infer_int8/ch_ppocr_mobile_v2.0_rec_slim_opt.nb \
    https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_rec_slim_opt.nb
Use the conversion tool in the repository (write_model.py) to convert the model. The conversion process includes three main steps:
a) Convert the PaddleLite model (a file with the .nb extension) into an intermediate representation (IR) graph (a .json file). The IR file is automatically stored in the same directory as the input PaddleLite model. (Known limitation: the --out_dir option has no effect in this conversion step.)
# Convert .nb model into IR file (.json file)
python ./readnb/write_model.py --model_path ./model_zoo/PpocrRec_infer_int8/ch_ppocr_mobile_v2.0_rec_slim_opt.nb --out_dir .

# "g_ch_ppocr_mobile_v2.0_rec_slim_opt.json" is generated in the same directory as the input model file
b) Manually adjust the intermediate representation (IR) model. Because the required adjustments are scattered throughout the IR file, we provide a model patch that applies them in one step and improves the developer experience.
# Modify the IR file with patch quickly. You could also do this modification manually.
patch -p0 model_zoo/PpocrRec_infer_int8/g_ch_ppocr_mobile_v2.0_rec_slim_opt.json < readnb/test_asset/ppocr_rec/g_ch_ppocr_rec.patch
c) Optionally, use the conversion script again to convert the adjusted IR model into a TOSA graph, then compile it with the Vela compiler (ethos-u-vela) provided by Arm for Ethos-U. For more details about the Vela compiler, please check its introduction on PyPI or the technical documentation on Arm Developer. You can also skip this conversion step, because it is performed automatically as part of Step 5.
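For reference, a manual Vela invocation looks roughly like the sketch below. This is not the repository's exact command: the input file name is hypothetical, and the --accelerator-config value must match the Ethos-U85 configuration (number of MAC units) you target; run vela --help to see the options supported by your installed version.

# Install the Vela compiler (assumes a release with Ethos-U85 support)
pip install ethos-u-vela

# Compile the converted network for an Ethos-U85 target (file name is illustrative)
vela ch_ppocr_mobile_v2.0_rec_slim_opt.tosa \
    --accelerator-config ethos-u85-256 \
    --output-dir ./vela_out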
Build the OCR application and check the inference results.
# Run inference
bash paddle_verify.sh -m ppocr_rec -p ./model_zoo/PpocrRec_infer_int8/test.jpg
An example of the test result is as follows:
telnetterminal0: Listening for serial connection on port 5000
telnetterminal1: Listening for serial connection on port 5001
telnetterminal5: Listening for serial connection on port 5002
telnetterminal2: Listening for serial connection on port 5003

handles.inputs->count is 1
input tensor scratch_addr address 0x7c11f840
input shapes 122880
copy input data into scratch_addr
handles.outputs->io[x] shapes is 655360
output tensor output_addr address 0x7c1bf840
output shapes 655360
output bin [0x7c1bf840 655360]
handles.outputs->count is 1
Shape : 655360
Rec Reuslut: Confidence: 0.966813
============ NPU Inferences : 1 ============
Profiler report, CPU cycles per operator:
ethos-u : cycle_cnt : 2083105832 cycles
Operator(s) total: 574619648 CPU cycles
Inference runtime: -987073648 CPU cycles total
NOTE: CPU cycle values and ratio calculations require FPGA and identical CPU/NPU frequency
Inference CPU ratio: 100.00
Inference NPU ratio: 0.00
cpu_wait_for_npu_cntr : 574619648 CPU cycles
Ethos-U PMU report:
ethosu_pmu_cycle_cntr : 2083105832
ethosu_pmu_cntr0 : 479
ethosu_pmu_cntr1 : 21
ethosu_pmu_cntr2 : 118511
ethosu_pmu_cntr3 : 0
ethosu_pmu_cntr4 : 592
Ethos-U PMU Events:[ETHOSU_PMU_SRAM_RD_DATA_BEAT_RECEIVED, ETHOSU_PMU_SRAM_WR_DATA_BEAT_WRITTEN, ETHOSU_PMU_EXT_RD_DATA_BEAT_RECEIVED, ETHOSU_PMU_EXT_WR_DATA_BEAT_WRITTEN, ETHOSU_PMU_NPU_IDLE]
============ Measurements end ============
Running Model Exit Successfully
Application exit code: 0.
Info: /OSCI/SystemC: Simulation stopped by user.
[run_fvp] Simulation complete, 0
Dump to out_tensors.bin
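The run also dumps the raw output tensor to out_tensors.bin. If you want to peek at it, the snippet below is a minimal sketch: it assumes the dump contains int8 values (suggested by the 655360-byte size reported above) and that NumPy is available in the virtual environment; adjust the dtype and any reshaping to the model's actual output layout.

# Optional: inspect the dumped output tensor (dtype below is an assumption)
python3 - <<'EOF'
import numpy as np

data = np.fromfile("out_tensors.bin", dtype=np.int8)
print("elements:", data.size)
print("first 16 values:", data[:16])
EOF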
Deploying PaddlePaddle models on Arm-based edge AI devices comes down to optimizing the model, preparing the software stack, and choosing the right hardware. Together, these steps let you deploy AI applications at the edge for fast, efficient inference close to where the data is generated.
Learn more about deploying AI models onto Arm-based edge AI hardware with our IoT Learning Paths:
Arm Developer Learning Paths