AI is being adopted rapidly at the edge, which makes it increasingly important to deploy machine learning models on Arm edge devices. Arm-based processors are common in embedded systems because of their low power consumption and efficiency. This tutorial shows you how to deploy PyTorch models on Arm edge devices such as the Raspberry Pi or the NVIDIA Jetson Nano.
Before you begin, make sure you have the following:

- A development machine with Python and PyTorch installed
- An Arm edge device (such as a Raspberry Pi or NVIDIA Jetson Nano) running Linux with Python available
- Network access to the device, for example over SSH, so you can copy files to it
On your development machine, load a pre-trained model and put it in evaluation mode:

```python
import torch
import torchvision.models as models

# Load a pre-trained ResNet-18
# (on torchvision >= 0.13, prefer weights=models.ResNet18_Weights.DEFAULT)
model = models.resnet18(pretrained=True)
model.eval()
```
Convert the model to TorchScript so it can run on the device without the original Python class definition:

```python
# Script the model and save it as a self-contained TorchScript file
scripted_model = torch.jit.script(model)
torch.jit.save(scripted_model, "resnet18_scripted.pt")
```
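Scripting handles models that contain Python control flow. For a purely feed-forward model like ResNet-18, tracing with a representative input is a common alternative; a minimal sketch, where the input shape matches the preprocessing used later in this tutorial:

```python
# Trace the model with a dummy input of the expected shape (1, 3, 224, 224)
example_input = torch.rand(1, 3, 224, 224)
traced_model = torch.jit.trace(model, example_input)
torch.jit.save(traced_model, "resnet18_traced.pt")
```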
On the Arm device, install PyTorch and torchvision:

```bash
pip install torch torchvision
```
Then verify the installation:

```python
import torch

print(torch.__version__)
print(torch.cuda.is_available())  # True on CUDA-capable devices such as the Jetson Nano
```
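To confirm that the Python interpreter you are running is an Arm build, you can also check the machine architecture; this small sketch uses only the standard library:

```python
import platform

# Expect "aarch64" on 64-bit Arm Linux (e.g., 64-bit Raspberry Pi OS or Jetson)
print(platform.machine())
```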
Copy the scripted model from your development machine to the device:

```bash
scp resnet18_scripted.pt user@device_ip:/path/to/destination
```
On the device, load the TorchScript model and run inference on a test image:

```python
import torch
from PIL import Image
from torchvision import transforms

# Load the TorchScript model
model = torch.jit.load("resnet18_scripted.pt")
model.eval()

# Preprocess an input image with the standard ImageNet transforms
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
img = Image.open("test_image.jpg")
img_tensor = preprocess(img).unsqueeze(0)  # Add batch dimension

# Perform inference
with torch.no_grad():
    output = model(img_tensor)
print("Predicted class:", output.argmax(1).item())
```
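The script prints a raw ImageNet class index. To get a human-readable label, you can map the index against a local copy of the ImageNet class list; the file name `imagenet_classes.txt` below is an assumption, so substitute whatever label file you use:

```python
# Assumes imagenet_classes.txt (hypothetical local file) contains the
# 1000 ImageNet labels, one per line, in index order
with open("imagenet_classes.txt") as f:
    labels = [line.strip() for line in f]

class_idx = output.argmax(1).item()
print("Predicted label:", labels[class_idx])
```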
To reduce model size and speed up CPU inference, apply dynamic quantization. Note that `quantize_dynamic` returns a regular `nn.Module`, so it must be scripted before it can be saved with `torch.jit.save`. For ResNet-18 this quantizes only the final `Linear` layer; the convolutions stay in float32:

```python
from torch.quantization import quantize_dynamic

# Quantize Linear layers to int8; conv layers are untouched
quantized_model = quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Script the quantized model so it can be saved as TorchScript
torch.jit.save(torch.jit.script(quantized_model), "resnet18_quantized.pt")
```
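A quick way to see the effect of quantization is to compare file sizes on disk; a minimal sketch, assuming both files were saved as shown above:

```python
import os

# Compare on-disk size of the original and quantized TorchScript files
for path in ("resnet18_scripted.pt", "resnet18_quantized.pt"):
    print(path, round(os.path.getsize(path) / 1e6, 1), "MB")
```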
Benchmark inference latency on the device. A single warm-up run before timing gives more stable numbers:

```python
import time

# Warm up once so one-time initialization doesn't skew the measurement
with torch.no_grad():
    model(img_tensor)

start_time = time.time()
with torch.no_grad():
    for _ in range(100):
        output = model(img_tensor)
end_time = time.time()
print("Average inference time:", (end_time - start_time) / 100, "seconds")
```
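On multi-core Arm CPUs, the number of intra-op threads PyTorch uses can noticeably affect latency, and matching it to the physical core count is a reasonable starting point. A minimal sketch; the core count of 4 is an assumption, so adjust it for your device:

```python
# Pin PyTorch's intra-op thread pool to the physical core count
# (4 is an assumption; a Raspberry Pi 4, for example, has 4 cores)
torch.set_num_threads(4)
print("Using", torch.get_num_threads(), "threads")
```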
To make the deployment reproducible across devices, you can package the model and inference script in a container. Example Dockerfile:
```dockerfile
FROM python:3.8-slim

RUN pip install torch torchvision pillow

COPY resnet18_scripted.pt /app/
COPY app.py /app/
WORKDIR /app

CMD ["python", "app.py"]
```
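The Dockerfile copies an `app.py` that is not shown above. A minimal sketch of what it might contain, essentially the inference script from earlier with the model file baked into the image:

```python
# app.py -- hypothetical entry point; reuses the inference code from above
import torch
from PIL import Image
from torchvision import transforms

model = torch.jit.load("resnet18_scripted.pt")
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img_tensor = preprocess(Image.open("test_image.jpg")).unsqueeze(0)
with torch.no_grad():
    print("Predicted class:", model(img_tensor).argmax(1).item())
```

Build and run on the device. The image name `pytorch-arm-app` is an assumption, and since the Dockerfile does not copy a test image, one is mounted at runtime:

```bash
docker build -t pytorch-arm-app .
docker run --rm -v "$(pwd)/test_image.jpg:/app/test_image.jpg" pytorch-arm-app
```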
Deploying PyTorch models on Arm edge devices comes down to three things: optimizing the model (for example with TorchScript and quantization), preparing the software environment on the device, and validating performance on the target hardware. With these steps in place, you can deploy AI applications at the edge and run fast, efficient inference close to where the data is generated.