In March 2022, the Game Developers Conference (GDC), one of the biggest events for video game developers, was held in San Francisco, USA. In recent years, the research presented there has driven Game AI in new directions, making it more lifelike, challenging, and enjoyable. This year we showcased a playable game demo called ‘Dr Arm’, in which you can play against a game AI implemented using the Unity Machine Learning Agents Toolkit (ML-Agents). We created this demo to show how Machine Learning (ML) can benefit games on Arm platforms, and how game developers can use this technology creatively.
Our game shows that ML works comfortably on a wide range of Arm-based devices, including mobile devices as well as laptops. In this blog series, I will explain how we developed this demo. Part 1 provides a general overview of the demo. We hope that this blog series will interest many game developers in machine learning technology and encourage them to try it out in their game development projects.
Figure 1. ML Agents demo in the Arm booth at GDC
The demo is ‘Dr Arm and the Machine Learned Knight!’, a traditional one-on-one boss battle in a large castle chamber. Attendees at GDC control the player character, Dr Arm, and aim to defeat the boss character, the Machine Learned Knight, which the game AI controls. Dr Arm is a character from the Mali Manga comics and appears in our in-house game. (You can download the full comics from here, where you can also learn about our Mali GPU technologies.)
Figure 2. Dr Arm from the Mali Manga comics (left) and screenshot of Arm’s In-house Game, Amazing Adventure of Dr Arm (right)
Let us see what the demo looks like.
Figure 3. Dr Arm’s boss battle demo
In this demo video, I control Dr Arm manually through a game controller, while the game AI we implemented controls the Knight character. The red gauge at the bottom is the boss character’s Health (HP). The gauges in the upper left show Dr Arm’s stats: the red gauge is his Health, and the blue gauge is his Mana (MP), which is consumed when he throws a fireball. Neither Health nor Mana can be recovered during a battle. The green gauge shows Stamina: actions such as attacking with a sword and rolling consume Stamina, but it recovers over time. As you can see in the video, we prepared game AIs at three difficulty levels: Easy, Medium, and Hard. The game can be deployed to Windows on Arm (WoA) devices and Android mobiles.
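The gauge rules described above can be sketched in a few lines of code. This is a minimal illustration only; the class name, costs, and regeneration rate are assumptions for the sketch, not the demo's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class FighterStats:
    """Illustrative stat block mirroring the gauges described above."""
    hp: float = 100.0           # Health: never recovers during a battle
    mp: float = 50.0            # Mana: spent on fireballs, never recovers
    stamina: float = 100.0      # Stamina: spent on attacks/rolls, regenerates
    stamina_regen: float = 5.0  # per-tick regeneration (assumed value)

    def throw_fireball(self, cost: float = 10.0) -> bool:
        """Cast only if enough Mana remains; Mana is never refunded."""
        if self.mp < cost:
            return False
        self.mp -= cost
        return True

    def sword_attack(self, cost: float = 15.0) -> bool:
        """Attack only if enough Stamina remains."""
        if self.stamina < cost:
            return False
        self.stamina -= cost
        return True

    def tick(self) -> None:
        """Stamina recovers over time, capped at its maximum."""
        self.stamina = min(100.0, self.stamina + self.stamina_regen)

stats = FighterStats()
stats.sword_attack()    # Stamina: 100 -> 85
stats.throw_fireball()  # Mana: 50 -> 40
stats.tick()            # Stamina regenerates: 85 -> 90; Mana stays at 40
```

Note that only Stamina is restored in `tick()`; Health and Mana only ever decrease during a battle, which is what makes them worth conserving against the Knight.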
Figure 4. Game screenshot
Alongside the playable demo, we showed at GDC that the game AI agent can be trained on a Windows on Arm device. At the time of writing, the Unity Editor does not support the Arm64 architecture, but Windows 11 supports x64 emulation, so the same setup as on an x64 machine works on Windows on Arm too. The figure below shows the Unity Editor running without any issues and agents being trained on a WoA laptop. We believe that WoA will be used in more and more fields.
Figure 5. Agent training running on Windows 11 on Arm at GDC
Unity’s ML-Agents allows game developers to train intelligent game AI agents within games and simulations. With this, you no longer need to hand-code your characters’ behavior; instead, you can let them learn it through Reinforcement Learning (RL). The basic mechanism behind ML-Agents is explained in our previous blog post and is very well described in Unity’s official documentation. Unity also publishes several sample projects using ML-Agents, which can serve as implementation references. In this blog series, we focus on how we implemented this game demo using ML-Agents.
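At its core, the learning loop that ML-Agents automates is the standard RL cycle: the agent observes the state, picks an action, receives a reward, and updates its policy. The toy below shows that cycle with tabular Q-learning against a two-state stand-in environment; it uses none of the ML-Agents API, and all names and hyperparameters are illustrative.

```python
import random

class ToyDuelEnv:
    """Toy stand-in for a game environment: the agent should pick
    action 1 ("attack") exactly when the opponent is open (state 1)."""
    def reset(self) -> int:
        self.state = random.randint(0, 1)
        return self.state

    def step(self, action: int):
        # Reward +1 for matching the opponent's opening, -1 otherwise.
        reward = 1.0 if action == self.state else -1.0
        self.state = random.randint(0, 1)
        return self.state, reward

def policy(state: int, q: dict) -> int:
    """Greedy policy over a tabular action-value estimate."""
    return max((q[(state, a)], a) for a in (0, 1))[1]

# Observe -> act -> receive reward -> update, repeated many times.
random.seed(0)
env = ToyDuelEnv()
q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}
state = env.reset()
for _ in range(500):
    # 10% random exploration, otherwise act greedily.
    action = random.choice((0, 1)) if random.random() < 0.1 else policy(state, q)
    next_state, reward = env.step(action)
    # Q-learning update with learning rate 0.1 and discount 0.9.
    best_next = max(q[(next_state, a)] for a in (0, 1))
    q[(state, action)] += 0.1 * (reward + 0.9 * best_next - q[(state, action)])
    state = next_state
```

In the real demo the policy is a neural network trained by PPO rather than a lookup table, and the "environment" is the Unity game itself, but the observe-act-reward-update cycle is the same.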
In the case of our one-on-one boss battle demo, we want to train the Knight character. But the Knight contends not only with the environment dynamics but also with Dr Arm, the human player. From the Knight’s perspective, you can therefore think of Dr Arm as part of the environment: his actions influence the next state and the reward the Knight receives. This means that when we train the Knight agent, we need Dr Arm to act as the Knight’s opponent.
Figure 6. Agent contends not only with environment but also target
The relative strength of Dr Arm also influences the training results. If Dr Arm is too strong, it is too difficult for the Knight agent to improve from scratch. On the other hand, if Dr Arm is too weak, the Knight learns to win but cannot compete with a stronger opponent. We need an opponent of roughly equal skill: challenging, but not too challenging.
Additionally, since our agent improves with each new game, its opponent needs to improve as well. If we think about it, the Knight agent itself satisfies both requirements: it is of roughly equal skill, and it improves over time. If we did not treat Dr Arm as an agent, a human would have to play him during training. That would prevent automating the training process and require significant human effort and time. Furthermore, there is no guarantee that playing Dr Arm manually during training improves his skill with each game. So it makes sense to train Dr Arm as an agent too.
In summary, the first step in creating the game is to train the two characters together as agents. At this step, we give the two characters the same agent code, meaning they share the same Neural Network (NN) model structure. This agent code communicates with the PyTorch backend through Unity’s communicator and the ML-Agents API, and the two agents train against each other. Once they are smart enough, we make Dr Arm controllable by a game controller. At game time, the agent only controls the Knight and uses the trained NN model.
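The idea of giving both characters the same agent code can be illustrated with a toy shared-architecture setup: one policy class, instantiated once per character, so both networks have identical structure while learning their own weights. The class name, layer sizes, and seeds below are assumptions for illustration, not the demo's actual network.

```python
import random

class PolicyNetwork:
    """Minimal fully connected policy: observation vector in,
    action scores out. Layer sizes are illustrative only."""
    def __init__(self, obs_size: int = 8, hidden: int = 16,
                 actions: int = 4, seed: int = 0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(obs_size)]
                   for _ in range(hidden)]
        self.w2 = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)]
                   for _ in range(actions)]

    def act(self, obs):
        # Hidden layer with ReLU, then a linear output layer.
        h = [max(0.0, sum(w * o for w, o in zip(row, obs))) for row in self.w1]
        scores = [sum(w * x for w, x in zip(row, h)) for row in self.w2]
        return scores.index(max(scores))  # greedy action index

# Same code and structure for both characters; each keeps its own
# weights (different seeds here stand in for independent learning).
knight = PolicyNetwork(seed=1)
dr_arm = PolicyNetwork(seed=2)

obs = [0.5] * 8
knight_action = knight.act(obs)
arm_action = dr_arm.act(obs)
```

Because the structures match, either character's trained model can later be loaded for inference by the Knight at game time, while Dr Arm switches over to game-controller input.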
Figure 7. Two phases for agents training and game play
This training setting is called Adversarial Self-Play: two interacting agents receive inverse reward signals. For example, as in zero-sum games, when one agent receives a positive reward, the competing agent receives a negative reward. This allows an agent to become increasingly skilled by always facing a perfectly matched opponent, and it was the strategy DeepMind employed when training AlphaGo.
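In ML-Agents, Adversarial Self-Play is enabled through a `self_play` section in the trainer configuration file. A sketch of what such a configuration might look like follows; the behavior name and every value here are illustrative assumptions, not the demo's actual settings:

```yaml
behaviors:
  Knight:                    # assumed behavior name
    trainer_type: ppo
    max_steps: 5.0e7
    self_play:
      save_steps: 20000      # snapshot the policy every N trainer steps
      team_change: 100000    # steps before the learning team switches
      swap_steps: 10000      # steps between opponent snapshot swaps
      window: 10             # pool of past snapshots to draw opponents from
      play_against_latest_model_ratio: 0.5
      initial_elo: 1200.0    # ELO rating used to track relative skill
```

Sampling opponents from a window of past snapshots, rather than always using the very latest policy, helps keep the opponent "challenging, but not too challenging" and stabilizes training.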
In part 2, I explore the agents' design in more detail. Check it out here.