The cloud management software ecosystem has been built with x86 as the primary architecture. As a result, it is a challenge to deploy these tools on Arm and other non-x86 architectures. This post targets anyone interested in using and improving container management tools on multiple hardware architectures.
To explore and improve these multi-architecture issues in the ecosystem, we’ve ported an open-source cloud technology demo called Weavesocks to aarch64 (Arm64). Weavesocks simulates an e-commerce website that sells socks. We can deploy the demo with either Docker Swarm or Kubernetes, but this blog focuses on the Weavesocks deployment with Docker Swarm.
Using containers to run web services has become popular in the past few years. One significant reason for this is that Docker created an easy-to-learn, easy-to-use ecosystem for building and running containers on a cluster. Docker image builds are automated with Dockerfiles. These files contain a list of commands specifying what to include in the image, and they often consist mainly of shell commands. Below, we show an example Dockerfile that specifies a container image for building Go applications.
RUN apt-get update && apt-get install -y libpcap-dev \
python-requests time file shellcheck git golang \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
ENV PATH=/go/bin:/usr/local/go/bin:$PATH
ENV GOPATH=/go
Aside from building images, Docker provides a container runtime engine and a cluster manager called Swarm. This allows us to use Docker exclusively to deploy the Weavesocks demo on a cluster. Swarm allows us to provision a cluster with numerous machines, or nodes. After provisioning, we can deploy containers onto the cluster and let Swarm load-balance them across the nodes. If a container crashes or a node goes offline, Swarm automatically redeploys the affected containers on the cluster. The details of how a cluster manager functions are outside the scope of this blog. However, it’s worth noting that cluster managers generally emphasize fault tolerance, high availability of services, resource management, latency, and security.
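To make this workflow concrete, here is a minimal sketch of the Docker commands involved, wrapped in shell functions. The function names, the IP address, the compose file name, and the stack name in the usage note are placeholders of ours and are not taken from the Weavesocks demo. Setting DOCKER=echo prints each command instead of running it.

```shell
#!/bin/sh
# Sketch of the Swarm workflow: create a cluster, add nodes, deploy services.
# Set DOCKER=echo to print the commands instead of running them (dry run).
# All addresses, tokens, and file names passed in are the caller's choice.

# Run once on the first node to create the cluster and make it a manager.
swarm_init() {
    ${DOCKER:-docker} swarm init --advertise-addr "$1"
}

# Run on every additional node, using the token printed by "swarm init".
# 2377 is the default Swarm cluster management port.
swarm_join() {
    ${DOCKER:-docker} swarm join --token "$1" "$2:2377"
}

# Run on a manager to deploy the services in a compose file as a stack.
stack_deploy() {
    ${DOCKER:-docker} stack deploy --compose-file "$1" "$2"
}
```

On the first node we would run `swarm_init 192.0.2.10`, on each additional node `swarm_join <token> 192.0.2.10` with the token printed by the init step, and finally `stack_deploy docker-compose.yml socks` on a manager.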
Weavesocks is a good representation of a production web service. Its deployment and management are very flexible. For example, tools like Docker Swarm, Kubernetes, or Marathon/Mesos can provision the cluster and orchestrate its containers. This flexibility makes Weavesocks an ideal platform to test, explore, and improve cloud technologies on Arm.
The demo is based on Microservices. A Microservice is a lightweight application that provides a single function or service. A collection of Microservices is networked to create a complete service, such as an e-commerce site like Weavesocks. Each Microservice in the demo is built into a Docker container for execution on a cluster. There are a total of 14 Microservices that manage features like user login, shopping carts, and payments. Additionally, various execution environments are used across the different Microservices. For example, some services run Java apps, others run Go apps, while others are databases like MongoDB and MySQL. The image below was created with an open source cluster visualization tool called Weavescope. It shows the logical connections between the Weavesocks Microservices. The figure also shows each Microservice labeled with its execution environment.
We deployed Weavesocks on various Arm-based platforms including the Hardkernel Odroid-C2 (embedded), Softiron Overdrive 1000 (server class), and Marvell MacchiatoBin (networking). We’ve also deployed the demo on a multi-architecture cluster composed of both Intel Broadwell and Softiron Overdrive 3000 machines. The tested clusters ranged from two to five machines, just enough to ensure the main cluster features are functional. Porting the demo to Arm wasn’t difficult, but the porting process provided insight into good practices for multi-architecture support.
From left to right: Odroid-C2 (quad-core Cortex-A53), Overdrive 1000 (quad-core Cortex-A57), MacchiatoBin (quad-core Cortex-A72)
An erroneous Linux kernel configuration can prevent tools like Docker from functioning properly. Platforms targeting the server market support EFI and PXE booting and tend not to have kernel configuration issues. This means we can install a standard release of a Linux distro directly from an .ISO or from the network. This approach gives us a kernel that is configured with everything needed to run Docker. On the other hand, embedded systems tend to have kernel configuration issues, since they often run a heavily modified Linux kernel configuration. We have seen embedded device kernel configurations that disable critical features like cgroups and namespaces, which are the primary resource control and isolation mechanisms needed to deploy containers. Thus, additional effort must be spent to build a kernel that supports running the Weavesocks demo on embedded platforms.
When kernel configuration prevents the use of a cloud manager like Docker or Kubernetes, often there are no error messages that hint at what could be wrong; things just don’t work. Symptoms of these issues can vary widely depending on the kernel feature that is missing. For example, the Docker daemon might fail to start, containers in a cluster might not communicate with each other, or overlay network encryption might not work. Often, when one kernel configuration issue is solved, another is discovered. Overall, it can be frustrating to sort through these issues.
There are a few ways to deal with kernel configuration issues. The first thing to check is whether the kernel configuration supports the features we know are needed. Some of the Linux features Docker needs are:
However, there may be features that we do not yet know are needed. In this case, we can try using the kernel configuration of a platform that is known to work with Docker. If we don’t have a known good configuration, trying an upstream kernel configuration is also a good option. Even better, if the platform supports EFI, we can try installing a standard release Linux distro to get the fully featured kernel that comes with it.
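As a rough sketch of the first check, the shell function below scans a kernel configuration file for a handful of options. The option names are real kernel configuration symbols related to namespaces, cgroups, and container networking, but the selection is our illustrative assumption and is far from the complete set Docker needs. The function name check_kernel_config is also hypothetical.

```shell
#!/bin/sh
# Report which kernel options from a short, illustrative list are missing.
# Usage: check_kernel_config /path/to/kernel/config
# The option list is an assumption for illustration; it is not exhaustive.
check_kernel_config() {
    config_file="$1"
    missing=0
    for opt in CONFIG_NAMESPACES CONFIG_NET_NS CONFIG_PID_NS \
               CONFIG_CGROUPS CONFIG_VETH CONFIG_BRIDGE; do
        # An option counts as enabled when built in (=y) or as a module (=m).
        if grep -Eq "^${opt}=(y|m)\$" "$config_file"; then
            echo "ok:      $opt"
        else
            echo "missing: $opt"
            missing=1
        fi
    done
    # Exit status is 0 only when every listed option is enabled.
    return "$missing"
}
```

On many distros the running kernel's configuration is available at /boot/config-$(uname -r), so `check_kernel_config "/boot/config-$(uname -r)"` is a reasonable first try. For a more thorough check, the Docker (moby) source tree carries a check-config.sh script under contrib/.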
Base image selection is the biggest issue preventing easy multi-architecture support in Docker. The base image for a container is selected with the “FROM” instruction. Base images are bound to a specific architecture, usually x86. Although Dockerfiles support variables that can be set at build time, these variables cannot be used in a “FROM” statement. This base image selection issue is a recognized problem, and there is already a proposed fix for it on GitHub. The fix is to add a layer of abstraction through a manifest list. Instead of calling “docker run” or “docker service create” directly on the individual image, we point those commands to a manifest list, and this list points to images that are essentially the same but built for different architectures. When a node tries to pull an image from this list, Docker pulls the image that is compatible with the node’s architecture. Follow the 'Add manifest command' PR on GitHub.
Until the above feature is merged into Docker, Dockerfile authors will either not bother with supporting multiple architectures, or they will find ways to deal with the issue externally to Docker. One way to handle this is to create multiple Dockerfiles, one per architecture. This works, but at the cost of redundancy, since typically the only difference between these Dockerfiles is the “FROM” statement. A better approach is to create a template Dockerfile. In this file, we place the build instructions for an architecture that is considered the default architecture for this image. Rather than calling “docker build” on the Dockerfile, we call make or a shell script. This script handles the multi-architecture details outside of Docker. The script does the following:
Although this is not a very elegant solution, using external scripts to work around Docker limitations is a common practice.
Example flow of a Makefile-based solution for aarch64
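As one possible sketch of the template approach, the shell function below renders an architecture-specific Dockerfile by rewriting the FROM line of the template. The function name render_dockerfile and the file names in the usage note are hypothetical, and the sketch assumes a single-stage Dockerfile with one FROM line.

```shell
#!/bin/sh
# Generate an architecture-specific Dockerfile from a template by rewriting
# its FROM line. Assumes a single-stage template (one FROM line).
# Usage: render_dockerfile <template> <base-image> <output>
render_dockerfile() {
    template="$1"
    base_image="$2"
    output="$3"
    # Replace the image named on the FROM line; copy all other lines as-is.
    sed "s|^FROM .*|FROM ${base_image}|" "$template" > "$output"
}
```

For example, `render_dockerfile Dockerfile arm64v8/ubuntu Dockerfile.aarch64` followed by `docker build -f Dockerfile.aarch64 .` would build the aarch64 variant, assuming the template's default base image is ubuntu and arm64v8/ubuntu is the equivalent Arm64 image.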
Certain Dockerfile author practices can also hinder multi-architecture support. For instance, some authors store prebuilt binaries in a git repo along with their Dockerfile. The Dockerfile copies these prebuilt binaries into the container image. This practice locks the Docker image to the specific architecture the binary was built for. A workaround is to have the make/script select between binaries for different architectures. However, this requires multiple binaries to be stored in the repo, one per version and architecture, which causes the git repo to grow intolerably large after a few updates of the binaries. This is because a git repo stores every version of a binary that was ever committed, even if the binary is later deleted from the repo.
An architecture-agnostic and repo-friendly alternative to this practice is to install the binary at build time. Since Dockerfiles run shell commands, we can call a package manager such as apt, yum, or zypper, or we can call wget to install the binary (Update: the Dockerfile 'ADD' command can also be used in place of wget). We show examples of this below with a fictional Hello World binary. One last thing to note about the wget method is that architecture-specific information is sometimes embedded in the URL; this can be managed with a Dockerfile argument.
# Copy binary from repo into the container
COPY helloworld /usr/local/bin/
# Run hello world on container startup
CMD ["helloworld"]
Requires binaries in the git repo, and locks the Dockerfile to the architecture of the binary
# Install v1.0.0 of hello world
RUN apt-get update \
&& apt-get install -y helloworld=1.0.0
# Run hello world on startup
CMD ["helloworld"]
Using a package manager to install the application allows for Dockerfile reuse across different architectures
RUN apk --update upgrade \
&& apk --no-cache --no-progress add ca-certificates \
&& apk add openssl \
&& rm -rf /var/cache/apk/*
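# Architecture and version are supplied at build time with --build-arg
ARG ARCH
ARG VER
# Install hello world with wget and make it executable
RUN wget -O /usr/local/bin/helloworld https://<site>.com/release/$ARCH/$VER/helloworld \
&& chmod +x /usr/local/bin/helloworld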
Using wget to download binaries from a releases server allows for Dockerfile reuse across different architectures
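One way to supply the architecture-specific part of the URL is to compute it outside Docker and pass it in as a build argument. The sketch below assumes the Dockerfile declares ARG ARCH and ARG VER, and that the fictional release server uses amd64 and arm64 as directory names. Both function names are hypothetical.

```shell
#!/bin/sh
# Map a machine name as printed by "uname -m" to the directory name used in
# the release URL. The target names (amd64, arm64) are an assumption about
# how the fictional release server lays out its files.
release_arch() {
    case "$1" in
        x86_64)        echo amd64 ;;
        aarch64|arm64) echo arm64 ;;
        *)             echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Build the image for the host's architecture, passing ARCH and VER to the
# Dockerfile as build arguments. Set DOCKER=echo for a dry run.
# Usage: build_helloworld <version>
build_helloworld() {
    arch=$(release_arch "$(uname -m)") || return 1
    ${DOCKER:-docker} build --build-arg ARCH="$arch" --build-arg VER="$1" \
        -t "helloworld:$1-$arch" .
}
```

Calling `build_helloworld 1.0.0` on an aarch64 machine would then run docker build with ARCH=arm64 and VER=1.0.0, so the same Dockerfile serves both architectures.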
The package manager and wget methods work if there is a prebuilt binary hosted somewhere on the Internet. What about the case where a prebuilt binary is not available? In this case, we can take advantage of Docker’s multi-stage build system. This system allows us to run short-lived containers that build the artifacts we need, and then pass them into the final image. Although this method takes more work than installing prebuilt binaries, this is often the best way to wrap applications in a container for the following reasons:
The above work has given us insight into what it takes to improve cloud management tool deployments on Arm. Overall, we found that making containers architecture agnostic is a straightforward and relatively simple process. As the offering of servers becomes richer and goes beyond x86, we encourage Dockerfile authors and cloud management tool developers to follow practices that make their software easily deployable on all available architectures.