Deploying PyTorch models on Arm edge devices: A step-by-step tutorial

Cornelius Maroa
April 22, 2025
2 minute read time.

AI is being rapidly adopted in edge computing. As a result, it is increasingly important to deploy machine learning models on Arm edge devices. Arm-based processors are common in embedded systems because of their low power consumption and efficiency. This tutorial shows you how to deploy PyTorch models on Arm edge devices, such as the Raspberry Pi or NVIDIA Jetson Nano.

Prerequisites

Before you begin, make sure you have the following:

  1. Hardware: An Arm-based device such as a Raspberry Pi, an NVIDIA Jetson Nano, or a similar edge device.
  2. Software
    • Python 3.7 or later installed on your device. Recent PyTorch releases require a newer Python version, so check the requirements of the release you plan to install.
    • A version of PyTorch compatible with the Arm architecture.
    • A trained PyTorch model.
  3. Dependencies: torch, torchvision, and Pillow (used to load images in Step 3), plus any other Python packages your model requires.

Step 1: Prepare your PyTorch model

  • Train or load your model
    • Train your model on a development machine or load a pre-trained model from PyTorch’s model zoo:

import torch
import torchvision.models as models

# Load a pre-trained model (the weights argument replaces the deprecated pretrained=True)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

  • Optimize the model
    • Convert the model to TorchScript so that it can be loaded on the device without the original Python model definition:

scripted_model = torch.jit.script(model)

torch.jit.save(scripted_model, "resnet18_scripted.pt")
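Before copying the file to the device, it can be worth confirming that the scripted model reproduces the eager model's outputs. The following is a minimal sketch and not part of the original tutorial: `outputs_match` is a hypothetical helper, and a small stand-in network is used so that the example runs without downloading weights. In practice you would pass `model` and `scripted_model` from above, with an input of shape (1, 3, 224, 224).

```python
import torch
import torch.nn as nn


def outputs_match(eager_model, scripted_model, example_input, atol=1e-5):
    """Return True if both models produce numerically close outputs."""
    eager_model.eval()
    scripted_model.eval()
    with torch.no_grad():
        expected = eager_model(example_input)
        actual = scripted_model(example_input)
    return torch.allclose(expected, actual, atol=atol)


# Small stand-in model (an assumption for illustration, not ResNet-18)
tiny_model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3),  # 8x8 input -> 6x6 feature map
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(4 * 6 * 6, 10),
).eval()
tiny_scripted = torch.jit.script(tiny_model)
example = torch.randn(1, 3, 8, 8)

print("Outputs match:", outputs_match(tiny_model, tiny_scripted, example))
```

If the check fails for your own model, the usual suspects are layers that behave differently in training mode, so make sure `eval()` was called before scripting.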

Step 2: Set up the Arm edge device

  • Install dependencies
    • Ensure your Arm device has Python installed.
    • Install PyTorch. Use a build that matches your device's architecture and Python version. For example, Raspberry Pi users running a 64-bit OS can use the following command:

pip install torch torchvision

  • Verify the installation

import torch

print(torch.__version__)
print(torch.cuda.is_available())  # Check if CUDA is supported (for devices like Jetson Nano)
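It can also help to confirm that Python itself reports an Arm machine type, since installing a package built for a different architecture is a common source of errors. The sketch below uses only the standard library. `is_arm_machine` is a hypothetical helper, and the set of machine strings it accepts is an assumption based on values commonly reported by Arm systems (for example `aarch64` or `armv7l`).

```python
import platform


def is_arm_machine(machine: str) -> bool:
    """Return True if a platform.machine() string looks like an Arm CPU."""
    name = machine.lower()
    # "aarch64"/"arm64" cover 64-bit Arm; "armv7l", "armv6l", etc. cover 32-bit Arm
    return name in {"aarch64", "arm64"} or name.startswith("arm")


machine = platform.machine()
print(f"Machine: {machine}, Arm: {is_arm_machine(machine)}")
```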

Step 3: Deploy the model to the device

  • Transfer the scripted model
    • Use scp or a USB drive to copy the model file (resnet18_scripted.pt) to the Arm device:

scp resnet18_scripted.pt user@device_ip:/path/to/destination
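To confirm that the file arrived intact, one option is to compare a checksum on both machines. This is a minimal sketch using the standard library; `sha256_of_file` is a hypothetical helper, not something the tutorial or PyTorch provides. Run it on the development machine and on the device, then compare the printed values.

```python
import hashlib
import os


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large model files do not need to fit in memory
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


model_path = "resnet18_scripted.pt"
if os.path.exists(model_path):
    print(sha256_of_file(model_path))
```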

  • Run inference
    • Write a Python script to load the model and run inference:

import torch
from PIL import Image
from torchvision import transforms

# Load the model
model = torch.jit.load("resnet18_scripted.pt")
model.eval()

# Preprocess an input image
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

img = Image.open("test_image.jpg").convert("RGB")  # Ensure 3 channels (handles grayscale or RGBA files)
img_tensor = preprocess(img).unsqueeze(0)  # Add batch dimension

# Perform inference
with torch.no_grad():
    output = model(img_tensor)
print("Predicted class:", output.argmax(1).item())
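The model returns raw scores (logits), and `argmax` reports only the single most likely class index. If you also want confidence values, the logits can be converted to probabilities with a softmax. The following is a sketch under stated assumptions: `top_k_predictions` is a hypothetical helper, and the stand-in logits cover four classes only so that the example runs without a model file. In the script above you would pass `output` instead.

```python
import torch


def top_k_predictions(logits: torch.Tensor, k: int = 5):
    """Return [(class_index, probability), ...] for the k most likely classes.

    Expects logits of shape (1, num_classes).
    """
    probabilities = torch.softmax(logits, dim=1)
    top_probs, top_indices = torch.topk(probabilities, k, dim=1)
    return list(zip(top_indices[0].tolist(), top_probs[0].tolist()))


# Stand-in logits for a 4-class model (illustrative values only)
example_logits = torch.tensor([[0.5, 2.0, -1.0, 1.0]])
for index, prob in top_k_predictions(example_logits, k=3):
    print(f"class {index}: {prob:.3f}")
```

Mapping the class index to a human-readable label requires the label list for the dataset the model was trained on, which is not covered here.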

Step 4: Optimize for edge performance

  • Quantization
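    • Use PyTorch’s dynamic quantization to reduce the model size and improve inference speed. Dynamic quantization stores the weights of the listed layer types (here, torch.nn.Linear) as 8-bit integers. ResNet-18 contains only one such layer, the final classifier, so expect modest savings for this particular model: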

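import torch
from torch.quantization import quantize_dynamic

# Quantize the eager-mode model from Step 1, not the TorchScript module loaded on the device
quantized_model = quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# torch.jit.save() expects a ScriptModule, so script the quantized model before saving
scripted_quantized = torch.jit.script(quantized_model)
torch.jit.save(scripted_quantized, "resnet18_quantized.pt")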
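To see how much the saved file actually shrank, you can compare the sizes of the two model files on disk. This is a small standard-library sketch; `file_size_mb` and `size_reduction_percent` are hypothetical helpers, and the file names are the ones used earlier in this tutorial.

```python
import os


def file_size_mb(path: str) -> float:
    """Return the size of a file in mebibytes."""
    return os.path.getsize(path) / (1024 * 1024)


def size_reduction_percent(original_path: str, quantized_path: str) -> float:
    """Return how much smaller the quantized file is, as a percentage."""
    original = os.path.getsize(original_path)
    quantized = os.path.getsize(quantized_path)
    return 100.0 * (original - quantized) / original


paths = ("resnet18_scripted.pt", "resnet18_quantized.pt")
if all(os.path.exists(p) for p in paths):
    print(f"Original:  {file_size_mb(paths[0]):.1f} MB")
    print(f"Quantized: {file_size_mb(paths[1]):.1f} MB")
    print(f"Reduction: {size_reduction_percent(*paths):.1f}%")
```

File size is only part of the picture, so it is worth re-checking accuracy and latency on the device after quantizing.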

  • Leverage hardware acceleration
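    • For devices with GPUs (for example, the NVIDIA Jetson Nano), move both the model and the input tensors to the CUDA device so that inference runs on the GPU.
    • Install a PyTorch build with GPU support for your device. On Jetson boards this typically means a build provided by NVIDIA for your JetPack version rather than the default package from PyPI.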
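A common pattern is to pick the device once and fall back to the CPU when CUDA is unavailable, so that the same script runs on a Raspberry Pi and on a Jetson. The sketch below assumes a small stand-in layer in place of the real model, and `select_device` is a hypothetical helper. For a saved TorchScript file, `torch.jit.load` accepts a `map_location` argument that serves the same purpose at load time.

```python
import torch
import torch.nn as nn


def select_device() -> torch.device:
    """Use CUDA when available, otherwise fall back to the CPU."""
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")


device = select_device()

# Stand-in for the real model; with a saved file you could instead use
# torch.jit.load("resnet18_scripted.pt", map_location=device)
demo_model = nn.Linear(8, 2).eval().to(device)

# Inputs must live on the same device as the model
inputs = torch.randn(1, 8).to(device)

with torch.no_grad():
    output = demo_model(inputs)

print(f"Ran on {device}; output shape {tuple(output.shape)}")
```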
  • Benchmark performance
    • Measure latency and throughput to validate the model’s performance on the edge device:

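import time

# Warm up first so one-time initialization does not skew the measurement
with torch.no_grad():
    for _ in range(10):
        model(img_tensor)

start_time = time.perf_counter()
with torch.no_grad():
    for _ in range(100):
        output = model(img_tensor)
end_time = time.perf_counter()

print("Average inference time (s):", (end_time - start_time) / 100)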
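An average can hide occasional slow runs, which matter on edge devices that throttle under load. The following sketch records each run separately and reports the median and 95th-percentile latency as well as throughput. `benchmark` is a hypothetical helper built on the standard library, and the workload here is a stand-in so that the example runs without a model.

```python
import statistics
import time


def benchmark(fn, runs: int = 100, warmup: int = 10) -> dict:
    """Time repeated calls to fn and summarize per-call latency in seconds."""
    for _ in range(warmup):
        fn()  # Warm-up calls are not timed

    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)

    latencies.sort()
    p95_index = min(len(latencies) - 1, int(0.95 * len(latencies)))
    mean = statistics.mean(latencies)
    return {
        "mean_s": mean,
        "median_s": statistics.median(latencies),
        "p95_s": latencies[p95_index],
        "throughput_per_s": 1.0 / mean if mean > 0 else float("inf"),
    }


# Stand-in workload; replace with a call to your model
stats = benchmark(lambda: sum(range(10_000)), runs=50, warmup=5)
print(stats)
```

With the inference script from Step 3, the call would look like `benchmark(lambda: model(img_tensor))`, wrapped in a `with torch.no_grad():` block.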

Step 5: Deploy at scale

  • Containerize the application
    • Use Docker to create a portable deployment environment.

Example Dockerfile (app.py is assumed to be the inference script from Step 3):

FROM python:3.8-slim

RUN pip install torch torchvision pillow

COPY resnet18_scripted.pt /app/
COPY app.py /app/

WORKDIR /app

CMD ["python", "app.py"]

  • Monitor and update
    • Implement logging and monitoring so that you can detect failures and performance regressions on deployed devices.
    • Use tools such as Prometheus (metrics collection) and Grafana (dashboards) for real-time insights.
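As a starting point for the logging side, the sketch below wraps an operation in a context manager that logs how long it took and records any exception. It uses only the standard library. `log_latency` and the logger name `inference` are assumptions for illustration, and exporting these values to a metrics system is outside the scope of this sketch.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("inference")


@contextmanager
def log_latency(operation: str, log: logging.Logger = logger):
    """Log the duration of the wrapped block, and the traceback if it fails."""
    start = time.perf_counter()
    try:
        yield
    except Exception:
        log.exception("%s failed", operation)
        raise
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        log.info("%s took %.2f ms", operation, elapsed_ms)


logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

with log_latency("inference"):
    time.sleep(0.01)  # Stand-in for output = model(img_tensor)
```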

Conclusion

To deploy PyTorch models on Arm edge devices, you need to optimize the model, prepare the software, and use the right hardware. These steps help you deploy AI applications at the edge. This allows fast, efficient inference close to where the data is generated.
