Arm has a long history of research in distributed systems. In my last blog I wrote about how we have focused that research on applying Arm technology to High Performance Computing (HPC), but for the past four years we have also been looking at various forms of edge, fog, and in-network computing, bridging sensors to supercomputers. Unlike supercomputing, distributed computing applications can be built across many different architectures, micro-architectures, and systems, as can be seen in Figure 1.
Figure 1: Arm Edge to Cloud
As such, we wanted our continuous integration and deployment service to be able to target all of these heterogeneous environments simultaneously. The principal service we were developing (depicted in Figure 2) was an edge computing cloud service built on top of Kubernetes, which orchestrates the deployment of containers, allowing developers targeting the edge to follow the same patterns they already use when developing and deploying to the cloud. We hosted the repositories for both our applications and the underlying framework on GitLab, so we wanted to use its continuous integration and deployment facilities to build and deploy Docker containers targeting multiple architectures.
Figure 2: Arm Research Edge Compute Project
This post details the underlying facilities we used to build multi-architecture images on top of GitLab's provided services and clusters. We have attempted to craft the templates and base images so that others wanting to build containers targeting multiple architectures can use them directly. If you just want the instructions for doing this, jump to the end of the article – but I figured it would be worth explaining the building blocks of the solution for those who are interested.
Cross-compiling for multiple architectures has historically been problematic. While compiler families have long supported being built for cross-development, maintaining multiple versions of the compilers, libraries, and support binaries for different architectures tends to be cumbersome, time-consuming, and error-prone. Several years ago, shortly after I joined Arm, I discovered that others had found a novel way to build cross-development and execution environments by combining Qemu's user-space emulation with Linux's binfmt-misc functionality.
Binfmt-misc is a facility in Linux that identifies the runtime required for a binary based on its header. It is used to direct which shell should interpret a script, when to invoke special runtimes to execute Java, and when to use an emulator to execute binaries built for a foreign architecture. This last feature can be used to instruct Linux to run Arm binaries on x86 (or vice versa) through user-space emulators such as Qemu, which exist for many architectures including x86, Arm, POWER, and RISC-V.
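To make this concrete, registrations live under /proc/sys/fs/binfmt_misc. The sketch below shows roughly what the handler installed by Debian's qemu-user-static package looks like for AArch64 binaries (exact flags, magic/mask, and paths vary by distribution, and the binary name is hypothetical):

```bash
# Inspect the handler registered for AArch64 ELF binaries
cat /proc/sys/fs/binfmt_misc/qemu-aarch64
# enabled
# interpreter /usr/bin/qemu-aarch64-static
# offset 0
# magic 7f454c460201010000000000000000000200b700   <- AArch64 ELF header
# mask ffffffffffffff00fffffffffffffffffeffffff

# With the handler in place, an arm64 binary runs transparently on an x86 host
./hello-arm64   # hypothetical arm64 binary; the kernel hands it to qemu for you
```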
The challenge is that, on its own, Qemu user emulation was traditionally only used for statically compiled binaries (since it wouldn't know how to find shared libraries and other support binaries for a foreign architecture). However, clever developers realized that with the right configuration of binfmt-misc (and statically compiled versions of the Qemu user-space emulator) one could chroot into a filesystem hierarchy for the foreign architecture. In other words, on an x86 environment you could execute within a Linux distro built for Arm simply by having the right Qemu-user-binfmt setup and unpacking an Arm distro somewhere in your existing x86 filesystem. This configuration proved popular enough that Debian and other Linux distributions added special meta-packages which install the static Qemu user emulators and configure binfmt-misc to use them.
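As a sketch of that workflow on an x86 Debian or Ubuntu host (the suite and directory names here are just examples):

```bash
# First stage: unpack an arm64 Debian userland into the x86 filesystem
sudo debootstrap --arch=arm64 --foreign stretch ./arm64-rootfs http://deb.debian.org/debian

# Copy the static emulator into the tree so it is visible inside the chroot,
# then finish the bootstrap and drop into an emulated arm64 environment
sudo cp /usr/bin/qemu-aarch64-static ./arm64-rootfs/usr/bin/
sudo chroot ./arm64-rootfs /debootstrap/debootstrap --second-stage
sudo chroot ./arm64-rootfs uname -m   # reports aarch64
```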
When container environments were added to Linux, the whole process became even easier. Since executing within a container has similar properties to chroot, having an appropriately configured binfmt-misc and Qemu available in the container namespace grants the same facilities. The Docker for Mac desktop version even shipped with this capability when it was initially released in 2016.
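With that same setup, running a foreign-architecture container on an x86_64 host is a one-liner (arm64v8/ubuntu is one of Docker Hub's per-architecture image namespaces):

```bash
# Run an arm64 Ubuntu container on an x86_64 host; qemu does the rest
docker run --rm arm64v8/ubuntu uname -m
# aarch64
```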
While the Docker environment supported running cross-architecture, Docker image formats didn't originally specify their architecture, and Docker Hub – the repository many use for built images – had no means of differentiating architectures. Developers used specialized tags to mark the architecture, which worked but was an imperfect solution, particularly for containers meant to target several architectures, like the ones we were developing. The proposed solution was to add a manifest capability to the Docker image format, which acts as a sort of meta-image with pointers to architecture-specific images for any tag. At the time of publishing this blog, manifest support is still classified as experimental, but you can enable it with an environment variable and use it to inspect images to see if they support multiple architectures or operating systems:
```
erivan01@nanopc-t4-1:~$ docker manifest
docker manifest is only supported on a Docker cli with experimental cli features enabled
erivan01@nanopc-t4-1:~$ export DOCKER_CLI_EXPERIMENTAL=enabled
erivan01@nanopc-t4-1:~$ docker manifest inspect ubuntu:latest | grep architecture
      "architecture": "amd64",
      "architecture": "arm",
      "architecture": "arm64",
      "architecture": "386",
      "architecture": "ppc64le",
      "architecture": "s390x",
```
The Docker command-line interface and eventually Docker Hub were upgraded to work with these manifests, allowing more convenient packaging and deployment of multi-architecture images – but building those images was still cumbersome.
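For a sense of why it was cumbersome, here is a sketch of the manual flow (image names are placeholders): each architecture-specific image had to be built and pushed separately before the manifest could be stitched together with the experimental CLI commands:

```bash
# Push the per-architecture images first (built separately for each platform)
docker push myrepo/myimage:amd64
docker push myrepo/myimage:arm64

# Then assemble and push a manifest list that unites them under one tag
export DOCKER_CLI_EXPERIMENTAL=enabled
docker manifest create myrepo/myimage:latest \
    myrepo/myimage:amd64 myrepo/myimage:arm64
docker manifest annotate myrepo/myimage:latest myrepo/myimage:arm64 --arch arm64
docker manifest push myrepo/myimage:latest
```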
Availability of desktop Arm development systems has long been a sore spot in the ecosystem. Even with Arm-based cloud resources available from Packet, Amazon, and others, developers still yearned for the ability to develop and test on their laptops or desktops. Arm partnered with Docker to facilitate broader cross-development capability, producing first-class cross-architecture build support in the BuildX interface to Docker's BuildKit, which not only builds for multiple architectures simultaneously, but also handles creating the manifest and pushing all images to the Docker registry. Docker BuildX was originally bundled into the desktop versions for Mac and Windows, but is easy to acquire and use on Linux as well.
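For those unfamiliar with it, a minimal BuildX session looks something like this (assuming Docker 19.03+ with the buildx plugin and the qemu binfmt handlers in place; the image name is a placeholder):

```bash
# Create and select a builder that supports multi-platform builds
docker buildx create --name multiarch --driver docker-container --use
docker buildx inspect --bootstrap

# Build for several architectures at once, assemble the manifest, and push
docker buildx build --platform linux/amd64,linux/arm64,linux/arm/v7 \
    -t myrepo/myimage:latest --push .
```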
Continuous integration services provide cluster facilities to automatically build and test code committed to source code repositories. GitLab has built-in facilities for continuous integration and provides runners for many architectures on Linux, which allow users to supply build resources for different architectures. The best path is to install from the repositories – this works for 64-bit Arm on Ubuntu 18.04 with some slight modification. These are the original instructions, but some fix-up is required; the following snippet handles it for you:
```bash
#!/bin/bash
if ! [ -x "$(command -v docker)" ]; then
  sudo apt-get install -y -qq docker.io
fi
if ! [ -x "$(command -v gitlab-runner)" ]; then
  sudo curl -s https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh | sudo bash
  sudo apt-get install -y -qq cdebootstrap gitlab-runner qemu-user-static binfmt-support
  sudo usermod -aG docker gitlab-runner
  sudo systemctl restart gitlab-runner
  sudo sed -i -e 's/stable \.\/debian-minbase/stretch \.\/debian-minbase http:\/\/deb.debian.org\/debian/g' /usr/lib/gitlab-runner/mk-prebuilt-images.sh
  sudo /usr/lib/gitlab-runner/mk-prebuilt-images.sh
fi
```
Or you can just run the snippet directly:
curl -s https://gitlab.com/snippets/1899384/raw | bash -x
Once the runner is installed, you register it using gitlab-ci-multi-runner register and input the token from your GitLab project/group CI/CD settings tab (Figure 3):
```
root@nanopi-m4-7:~# gitlab-ci-multi-runner register --docker-privileged
Running in system-mode.

Please enter the gitlab-ci coordinator URL (e.g. https://gitlab.com/):
https://gitlab.com/
Please enter the gitlab-ci token for this runner:
8mprXEysyV7pSVsbB7a3
Please enter the gitlab-ci description for this runner:
[nanopi-m4-7]:
Please enter the gitlab-ci tags for this runner (comma separated):

Whether to lock the Runner to current project [true/false]:
[true]:
Registering runner... succeeded                     runner=dazz_PbZ
Please enter the executor: docker+machine, docker-ssh+machine, docker, docker-ssh, shell, kubernetes, parallels, ssh, virtualbox:
docker
Please enter the default Docker image (e.g. ruby:2.1):
alpine:latest
Runner registered successfully. Feel free to start it, but if it's running already the config should be automatically reloaded!
root@nanopi-m4-7:~#
```
If you want to make sure your runner is used, disable shared runners on the project/group and/or use a tag (both when registering the runner and within your .gitlab-ci.yml file).
Figure 3: GitLab CI/CD runner configuration
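If you prefer to script the registration (and bake the tag in at the same time), gitlab-runner also supports a non-interactive mode; a sketch, with the token as a placeholder:

```bash
# Non-interactive registration of a tagged arm64 docker-executor runner
sudo gitlab-runner register \
    --non-interactive \
    --url https://gitlab.com/ \
    --registration-token <your-project-token> \
    --executor docker \
    --docker-image alpine:latest \
    --docker-privileged \
    --tag-list arm64 \
    --locked="false"
```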
The GitLab runner and CI/CD documentation can give you more details on this process if you are unfamiliar with it. Note, however, that GitLab's own cluster resources for continuous integration are x86_64, so if you wanted to use their cluster, x86_64 was the only architecture you could target.
GitLab CI supports several methods for building Docker images, the most common of which is Docker-in-Docker, where Docker itself is run inside a Docker container.
Figure 4: Docker-in-Docker Configurations
After the release of BuildKit's multi-architecture support, Jono Hill realized he could use a variation of GitLab CI's Docker-in-Docker to build containers for multiple architectures using GitLab's own cluster resources. He provided a build recipe and pre-built Docker images on Docker Hub that can be used to do this. We based our solution on Jono's recipe and are in the process of incorporating it into a variation of GitLab's Docker template and auto-dev-ops pipeline. The resulting stack can be run either on GitLab's cluster or on user-provided cluster resources. I've made a few changes to the original recipe and provided multi-arch versions of the pre-built Docker images, so you can now run on either x86_64 or arm64 (and generate manifest-enabled multi-arch images for both architectures, in addition to arm/v7, arm/v6, and potentially any other architecture supported by qemu-user-static).
The simple method for enabling multi-arch Docker builds within GitLab is to use or include our simple template, which uses a build image containing Qemu, Docker, and BuildX. There are two build targets, one for the master branch and one for everything else – the only difference is that the master branch will create a 'latest' tag image while the others use a git commit id for the tag. The architectures to target are controlled by a GitLab CI variable – defaults are provided in the template, but they can be overridden by variables provided during the pipeline run or by configuration within the repository. Since the base image is itself multi-arch, it can run on either x86_64 or arm64 (the scripts still need a little more tweaking to generate the BuildX base image on armv7 or armv6). Simply place the following code in the .gitlab-ci.yml in the root directory of your git repository:
```yaml
variables:
  CI_BUILD_IMAGE: "registry.gitlab.com/ericvh/docker-buildx-qemu"
  CI_BUILDX_ARCHS: "linux/amd64,linux/arm64"

.build:
  image: $CI_BUILD_IMAGE
  stage: build
  services:
    - name: docker:dind
      entrypoint: ["env", "-u", "DOCKER_HOST"]
      command: ["dockerd-entrypoint.sh"]
  variables:
    DOCKER_HOST: tcp://docker:2375/
    DOCKER_DRIVER: overlay2
    # See https://github.com/docker-library/docker/pull/166
    DOCKER_TLS_CERTDIR: ""
  retry: 2
  before_script:
    - |
      if [[ -z "$CI_COMMIT_TAG" ]]; then
        export CI_APPLICATION_REPOSITORY=${CI_APPLICATION_REPOSITORY:-$CI_REGISTRY_IMAGE/$CI_COMMIT_REF_SLUG}
        export CI_APPLICATION_TAG=${CI_APPLICATION_TAG:-$CI_COMMIT_SHA}
      else
        export CI_APPLICATION_REPOSITORY=${CI_APPLICATION_REPOSITORY:-$CI_REGISTRY_IMAGE}
        export CI_APPLICATION_TAG=${CI_APPLICATION_TAG:-$CI_COMMIT_TAG}
      fi
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin $CI_REGISTRY

build:buildx-master:
  extends: .build
  only:
    refs:
      - master
    variables:
      - $CI_BUILDX_ARCHS
  script:
    # Use docker-container driver to allow useful features (push/multi-platform)
    - update-binfmts --enable # Important: Ensures execution of other binary formats is enabled in the kernel
    - docker buildx create --driver docker-container --use
    - docker buildx inspect --bootstrap
    - docker buildx ls
    - docker buildx build --platform $CI_BUILDX_ARCHS --progress plain --pull -t "$CI_REGISTRY_IMAGE" --push .

build:buildx:
  extends: .build
  only:
    variables:
      - $CI_BUILDX_ARCHS
  except:
    refs:
      - master
  script:
    - update-binfmts --enable # Important: Ensures execution of other binary formats is enabled in the kernel
    - docker buildx create --driver docker-container --use
    - docker buildx inspect --bootstrap
    - docker buildx ls
    - docker buildx build --platform $CI_BUILDX_ARCHS --progress plain --pull -t "$CI_APPLICATION_REPOSITORY:$CI_APPLICATION_TAG" --push .
```
Alternatively, you can just include it directly and optionally override variables such as CI_BUILDX_ARCHS:
```yaml
include:
  - project: 'ericvh/gitlab-ci-arm-template'
    file: '/.gitlab-ci.yml'

variables:
  CI_BUILDX_ARCHS: "linux/arm64,linux/amd64"
```
One downside of the current approach is that an emulated build takes on the order of ten times longer to complete than a native build, and while this gives more time for sword fighting, it is hardly an ideal scenario. End users can provide their own native hardware as builders, which is probably the shortest path to more rapid build times. If users do not have their own Arm server hardware, they can try these methods on cloud services that support Arm hardware, such as Amazon or Packet. To support users who have access to hardware (either their own systems or in the cloud), I've also added a number of rules to the template mentioned above which build natively on runners for each architecture and then create a manifest from the architecture-specific images. Since users may not have both 64-bit and 32-bit Arm hardware available, you can select which platforms to build for (other than amd64) with variables included either in your .gitlab-ci.yml or when executing the runner. For this to work, we tag the runners as mentioned in the Continuous Integration section – if you forgot to tag the runner when you registered it, you can use the edit function in GitLab's CI/CD settings to add or change tags. If you use tagged runners, make sure you keep GitLab's shared runners enabled so you can take advantage of them to build the amd64 version of the image.
Figure 5: Tagged Runners in GitLab
Here is the additional template code if you want to add it directly (it depends on elements from the BuildX configuration above, so make sure you include both):
```yaml
# Make sure you include the variables and .build sections from the BuildX example above
build:arm64:
  extends: .build
  only:
    variables:
      - $CI_BUILD_ARM64
  except:
    variables:
      - $CI_BUILDX_ARCHS
  tags:
    - arm64
  script:
    - docker build -t "$CI_APPLICATION_REPOSITORY/arm64:$CI_APPLICATION_TAG" .
    - docker push "$CI_APPLICATION_REPOSITORY/arm64:$CI_APPLICATION_TAG"

build:amd64:
  extends: .build
  except:
    variables:
      - $CI_AMD64_DISABLED
      - $CI_BUILDX_ARCHS
  script:
    - docker build -t "$CI_APPLICATION_REPOSITORY/amd64:$CI_APPLICATION_TAG" .
    - docker push "$CI_APPLICATION_REPOSITORY/amd64:$CI_APPLICATION_TAG"

build:manifest:
  extends: .build
  stage: deploy
  except:
    variables:
      - $CI_BUILDX_ARCHS
  script:
    - echo "Checking amd64 build..." && [[ -z $CI_AMD64_DISABLED ]] && echo "found" && export CI_MANIFEST_LIST="$CI_APPLICATION_REPOSITORY/amd64:$CI_APPLICATION_TAG"
    - echo "Checking arm64 build..." && [[ $CI_BUILD_ARM64 ]] && echo "found" && export CI_MANIFEST_LIST="$CI_MANIFEST_LIST $CI_APPLICATION_REPOSITORY/arm64:$CI_APPLICATION_TAG"
    - export DOCKER_CLI_EXPERIMENTAL=enabled
    - echo $CI_MANIFEST_LIST
    - docker manifest create $CI_APPLICATION_REPOSITORY:$CI_APPLICATION_TAG $CI_MANIFEST_LIST && docker manifest push $CI_APPLICATION_REPOSITORY:$CI_APPLICATION_TAG
    - docker manifest create $CI_APPLICATION_REPOSITORY:latest $CI_MANIFEST_LIST && docker manifest push $CI_APPLICATION_REPOSITORY:latest
    - echo "Checking master" && [[ $CI_COMMIT_REF_NAME == "master" ]] && docker manifest create $CI_REGISTRY_IMAGE:latest $CI_MANIFEST_LIST && docker manifest push $CI_REGISTRY_IMAGE:latest
```
Alternatively, you can just include it directly and hard-set which architectures to build:
```yaml
include:
  - project: 'ericvh/gitlab-ci-arm-template'
    file: '/.gitlab-ci.yml'

variables:
  CI_BUILD_ARM64: "enabled"
```
You'll notice that the manifest gets built during the deploy phase, to make sure any previous builds have completed. In this case, the native arm64 builder actually completes faster than the amd64 builder (probably due to having dedicated resources).
Figure 6: GitLab Pipeline View
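Once the pipeline completes, you can sanity-check the result the same way we inspected ubuntu:latest earlier (the registry path below is a placeholder for your own project):

```bash
# Confirm the pushed image is a manifest list covering both architectures
export DOCKER_CLI_EXPERIMENTAL=enabled
docker manifest inspect registry.gitlab.com/<group>/<project>:latest | grep architecture
#   "architecture": "amd64",
#   "architecture": "arm64",
```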
Current versions of the template are maintained on GitLab in the http://gitlab.com/ericvh/gitlab-ci-arm-template project. If you'd like to see an example of simply including the template, take a look at http://gitlab.com/ericvh/fluentbit. Beyond our simple template, we are working on a version that integrates with GitLab's richer auto-dev-ops template. This article focuses only on building images, but there is also deployment. GitLab has Kubernetes and Function-as-a-Service deployment capabilities, but both are currently x86-only, due to relying on x86-specific images in the automated recipes. We have custom deployment scripts which can target Arm and are looking at how these might be better integrated with the richer set of GitLab templates.
Thanks to Josh Minor, Luis Pena, Jon Hermes, and Alexandre Ferreira, who contributed in various ways to figuring out Arm builds on GitLab pipelines.