Docker images for TensorFlow and PyTorch running on Ubuntu 18.04 for Arm are now available. This article explains how to build and use the Docker images for TensorFlow and PyTorch on Arm.
TensorFlow and PyTorch are two of the most popular machine learning frameworks. Both are seeing increased usage on Arm, ranging from smaller systems like the Raspberry Pi to larger systems for server and high-performance computing (HPC). Even though there is some support for AArch64 in packages already, users may want to compile everything from source. Reasons include using specific tools, targeting a different runtime environment, and experimenting with performance improvements from underlying libraries. Arm continues work to make ML on Arm well supported and to contribute optimizations to achieve the highest possible performance.
We hope these Docker images and the recipes to create them will be helpful for anybody looking to use TensorFlow and PyTorch on AArch64.
Scripts to build an Ubuntu 18.04 based Docker image are available from the Arm tool-solutions repository on GitHub.
The finished TensorFlow and PyTorch images contain:
The TensorFlow image also contains a Python3 environment built from CPython 3.7 containing:
The PyTorch image also contains a Python3 environment built from CPython 3.8 containing:
To build and run the Docker images, make sure the machine being used is an Arm AArch64 system.
$ uname -m
aarch64
Any AWS EC2 instance powered by Graviton can be used to try out the images, including A1, T4g, M6g, C6g, or R6g. The TensorFlow and PyTorch images are significantly faster to build and run on Graviton2. Another thing to keep in mind is that builds done on Graviton2 are optimized for the Neoverse N1 and will not run on AWS A1 instances.
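As a sketch, a build wrapper could guard against running on the wrong architecture using the uname -m check shown above; check_arch here is a hypothetical helper, not part of the repository scripts:

```shell
# Hypothetical guard for a build wrapper script: proceed only on AArch64.
check_arch() {
    # Return success only when the reported machine type is aarch64
    [ "$1" = "aarch64" ]
}

if check_arch "$(uname -m)"; then
    echo "AArch64 machine detected, OK to build"
else
    echo "Error: these images must be built on an AArch64 machine" >&2
fi
```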
Docker for Linux is recommended. Instructions on how to install Docker are available for various Linux distributions on the install page.
In summary, the commands to install Docker on Ubuntu for a user named ubuntu are:
$ sudo apt update
$ sudo apt upgrade -y
$ curl -fsSL get.docker.com -o get-docker.sh && sh get-docker.sh
$ sudo usermod -aG docker ubuntu ; newgrp docker
$ docker run hello-world
Similar steps can be used for other Linux distributions.
Start by cloning the repository. Install git if needed.
$ git clone https://github.com/ARM-software/Tool-Solutions.git
To use TensorFlow change to the tensorflow-aarch64/ directory:
$ cd Tool-Solutions/docker/tensorflow-aarch64
For PyTorch change directory to the pytorch-aarch64 directory:
$ cd Tool-Solutions/docker/pytorch-aarch64
Each framework has a five-stage Dockerfile, so incremental progress can be saved and reused as needed.
The build.sh script builds the images and has a help flag to review the options. The --build-type flag specifies which set of images to build.
Look at the scripts and directory to see the details of the build steps. These can be modified as needed.
To build all images use:
$ ./build.sh --build-type full
The default for TensorFlow is TensorFlow 1. To build TensorFlow 2 use the command-line option --tf_version 2. The images are tagged with -v1 or -v2 depending on the selected version of TensorFlow.
TensorFlow can optionally be built with oneDNN using the --onednn (or --dnnl) flag. Without this flag, TensorFlow's default Eigen backend is used. The same flag also selects the BLAS backend for oneDNN: use --onednn reference for the C++ reference kernels and --onednn openblas for OpenBLAS.
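The version and backend flags compose straightforwardly. As a sketch, build_cmd below is a hypothetical helper that assembles a build.sh invocation from the flags described above (an empty backend means the default Eigen build):

```shell
# Hypothetical helper: compose a build.sh invocation from a TensorFlow
# version and an optional oneDNN backend ("" means the default Eigen build).
build_cmd() {
    tf_version="$1"
    backend="$2"
    cmd="./build.sh --build-type full --tf_version $tf_version"
    if [ -n "$backend" ]; then
        cmd="$cmd --onednn $backend"
    fi
    echo "$cmd"
}

build_cmd 2 openblas   # TensorFlow 2 with oneDNN and OpenBLAS
build_cmd 1 ""         # TensorFlow 1 with the default Eigen backend
```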
Building TensorFlow is prone to running out of memory, but the --bazel_memory_limit flag can be used to avoid exhausting available memory.
For example, to build TensorFlow 2 successfully on a machine with 32GB of memory, set a limit such as:
$ ./build.sh --build-type full --tf_version 2 --onednn --bazel_memory_limit 30000 --jobs 16
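As a sketch of how such a limit might be chosen, the hypothetical helper below leaves roughly 2 GB of headroom below total RAM, an assumption that matches the 30000 MB limit used for the 32GB machine above:

```shell
# Hypothetical helper: pick a Bazel memory limit (in MB) that leaves
# roughly 2 GB of headroom below the machine's total RAM.
mem_limit_mb() {
    total_mb="$1"
    echo $((total_mb - 2000))
}

limit="$(mem_limit_mb 32000)"
echo "Suggested --bazel_memory_limit: $limit"
```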
Once the images are built, use the docker tag and docker push commands to save them in your favorite image registry. I use Docker Hub to save the images. Substitute your own Docker ID when tagging the images.
$ docker tag tensorflow-v2armpl jasonrandrews/tensorflow-v2armpl
$ docker login
(login with username and password)
$ docker push jasonrandrews/tensorflow-v2armpl
Next, let us see how to run the images.
MLCommons (MLPerf) benchmarks are included in both the TensorFlow and PyTorch images.
On any AArch64 machine with Docker installed, use the following commands to run a benchmark.
$ docker pull jasonrandrews/tensorflow-v2armpl
$ docker tag jasonrandrews/tensorflow-v2armpl tensorflow-v2armpl
$ docker run -it --init tensorflow-v2armpl
Refer to the MLPerf GitHub area for more information on how to download datasets and models. Example scripts are provided in the $HOME directory of the final image.
To run resnet50 on the ImageNet min-validation dataset for image classification using TensorFlow:
$ export DATA_DIR=${HOME}/CK-TOOLS/dataset-imagenet-ilsvrc2012-val-min
$ export MODEL_DIR=$(pwd)
$ ./download-model.sh
$ ./download-dataset.sh
... output omitted

At the prompt, select option 1:

0) imagenet-2012-val-min-resized  Version 2012  (476be5741f52384f)
1) imagenet-2012-val-min  Version 2012  (60cb8b2218445c36)
2) imagenet-2012-val  Version 2012  (14db79a136d98dd4)
3) dataset-imagenet-preprocessed-using-tensorflow  (fac1d0d5f4e69a85)
4) dataset-imagenet-preprocessed-using-pillow  (a6a4613ba6dfd570)
5) dataset-imagenet-preprocessed-using-opencv  (4932bbdd2ac7a17b)

Please select the package to install [ hit return for "0" ]: 1

... more output omitted

$ cd ./inference/vision/classification_and_detection
$ ./run_local.sh tf resnet50 cpu
To run ssd-resnet34 on the COCO dataset for object detection using PyTorch, pull the image and run:
$ docker pull jasonrandrews/pytorch
$ docker tag jasonrandrews/pytorch pytorch
$ docker run -it --init pytorch
$ export DATA_DIR=${HOME}/CK-TOOLS/dataset-coco-2017-val
$ export MODEL_DIR=$(pwd)
$ ./download-model.sh
$ ./download-dataset.sh
$ cd ./inference/vision/classification_and_detection
$ ./run_local.sh pytorch ssd-resnet34 cpu
Set the environment variable MKLDNN_VERBOSE=1 to verify that the build is using oneDNN when running the benchmarks.
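For example, export the variable before launching a benchmark; the commented run_local.sh invocation is the TensorFlow example from earlier in this article:

```shell
# Enable oneDNN verbose logging: each executed oneDNN primitive is
# reported on stdout, confirming the oneDNN backend is in use.
export MKLDNN_VERBOSE=1
echo "MKLDNN_VERBOSE=$MKLDNN_VERBOSE"
# ./run_local.sh tf resnet50 cpu   # benchmark invocation from the article
```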
Docker images for TensorFlow and PyTorch on AArch64 are now available on Docker Hub to get up and running quickly. The images use different tags to capture the build options described above for the various libraries. The instructions reviewed here also enable users to build custom images for these machine learning frameworks. We welcome any feedback to make them easier to use or to increase performance. Please file any issues in GitHub or open a pull request if you have ideas for improvement. Another good place to look for help is the Getting started with AWS Graviton project.
[CTAToken URL = "https://github.com/ARM-software/tool-solutions" target="_blank" text="Visit Github" class ="green"]