In March 2022, the Game Developers Conference (GDC), one of the biggest events for video game developers, was held in San Francisco, USA. In recent years, the research presented there has driven Game AI in new directions, making it more lifelike, challenging, and enjoyable. This year we showcased a playable game demo called ‘Dr Arm’, in which you can play against a game AI implemented using the Unity Machine Learning Agents Toolkit (ML-Agents). We created this demo to show how Machine Learning (ML) can benefit games on Arm platforms, and how game developers can use this technology creatively.
Our game shows that ML works comfortably on a wide range of Arm-based devices, including mobile devices as well as laptops. In this blog series, I will explain how we developed this demo. Part 1 provides a general overview of the demo. We hope that this blog series will interest many game developers in machine learning technology and encourage them to try it out in their game development projects.
Figure 1. ML Agents demo in the Arm booth at GDC
The demo is ‘Dr Arm and the Machine Learned Knight!’, a traditional one-on-one boss battle in a large castle chamber. Attendees at GDC control the player character, Dr Arm, and aim to defeat the boss character, the Machine Learned Knight, which the game AI controls. Dr Arm is a character from the Mali Manga comics and appears in our in-house game. (You can download the full comics from here, where you can also learn about our Mali GPU technologies.)
Figure 2. Dr Arm from the Mali Manga comics (left) and screenshot of Arm’s In-house Game, Amazing Adventure of Dr Arm (right)
Let us see what the demo looks like.
Figure 3. Dr Arm’s boss battle demo
In this demo video, I control Dr Arm manually through a game controller, while the game AI we implemented controls the Knight character. The red gauge at the bottom is the boss character’s Health (HP). The gauges in the upper left show Dr Arm’s stats: the red gauge is his Health, and the blue gauge is his Mana (MP), which is consumed when he throws a fireball. Neither Health nor Mana can be recovered during a battle. The green gauge shows Stamina: actions such as attacking with a sword and rolling consume Stamina, but it recovers over time. As you can see in the video, we prepared game AIs at three difficulty levels: Easy, Medium, and Hard. The game can be deployed to Windows on Arm (WoA) devices and Android mobiles.
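The gauge rules described above can be sketched in a few lines of code. This is a minimal illustration only; the class name, costs, and regeneration rate are assumptions for the sketch, not the demo's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class FighterStats:
    """Illustrative stat block mirroring the gauges described above."""
    hp: float = 100.0           # Health: never recovers during a battle
    mp: float = 50.0            # Mana: spent on fireballs, never recovers
    stamina: float = 100.0      # Stamina: spent on attacks/rolls, regenerates
    stamina_regen: float = 5.0  # per-tick regeneration (assumed value)

    def throw_fireball(self, cost: float = 10.0) -> bool:
        """Cast only if enough Mana remains; Mana is never refunded."""
        if self.mp < cost:
            return False
        self.mp -= cost
        return True

    def sword_attack(self, cost: float = 15.0) -> bool:
        """Attack only if enough Stamina remains."""
        if self.stamina < cost:
            return False
        self.stamina -= cost
        return True

    def tick(self) -> None:
        """Stamina recovers over time, capped at its maximum."""
        self.stamina = min(100.0, self.stamina + self.stamina_regen)

stats = FighterStats()
stats.sword_attack()    # Stamina: 100 -> 85
stats.throw_fireball()  # Mana: 50 -> 40
stats.tick()            # Stamina regenerates: 85 -> 90; Mana stays at 40
```

Note that only Stamina is restored in `tick()`; Health and Mana only ever decrease during a battle, which is what makes them worth conserving against the Knight.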
Figure 4. Game screenshot
Alongside the playable demo, we showed at GDC that the game AI agent can be trained on a Windows on Arm device. At the time of writing, the Unity Editor does not support the Arm64 architecture, but Windows 11 supports x64 emulation, so the same setup as on an x64 machine works on Windows on Arm too. The figure below shows the Unity Editor running without any issues and agents being trained on a WoA laptop. We believe that WoA will be used in more and more fields.
Figure 5. Agent training running on Windows 11 on Arm at GDC
Unity’s ML-Agents allows game developers to train intelligent game AI agents within games and simulations. With this, you no longer need to hand-code your characters’ behavior; instead, you can let them learn it through Reinforcement Learning (RL). The basic mechanism behind ML-Agents is explained in our previous blog post and is very well described in Unity’s official documentation. Unity also publishes several sample projects using ML-Agents, which can serve as implementation references. In this blog series, we focus on how we implemented this game demo using ML-Agents.
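At its core, the learning loop that ML-Agents automates is the standard RL cycle: the agent observes the state, picks an action, receives a reward, and updates its policy. The toy below shows that cycle with tabular Q-learning against a two-state stand-in environment; it uses none of the ML-Agents API, and all names and hyperparameters are illustrative.

```python
import random

class ToyDuelEnv:
    """Toy stand-in for a game environment: the agent should pick
    action 1 ("attack") exactly when the opponent is open (state 1)."""
    def reset(self) -> int:
        self.state = random.randint(0, 1)
        return self.state

    def step(self, action: int):
        # Reward +1 for matching the opponent's opening, -1 otherwise.
        reward = 1.0 if action == self.state else -1.0
        self.state = random.randint(0, 1)
        return self.state, reward

def policy(state: int, q: dict) -> int:
    """Greedy policy over a tabular action-value estimate."""
    return max((q[(state, a)], a) for a in (0, 1))[1]

# Observe -> act -> receive reward -> update, repeated many times.
random.seed(0)
env = ToyDuelEnv()
q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}
state = env.reset()
for _ in range(500):
    # 10% random exploration, otherwise act greedily.
    action = random.choice((0, 1)) if random.random() < 0.1 else policy(state, q)
    next_state, reward = env.step(action)
    # Q-learning update with learning rate 0.1 and discount 0.9.
    best_next = max(q[(next_state, a)] for a in (0, 1))
    q[(state, action)] += 0.1 * (reward + 0.9 * best_next - q[(state, action)])
    state = next_state
```

In the real demo the policy is a neural network trained by PPO rather than a lookup table, and the "environment" is the Unity game itself, but the observe-act-reward-update cycle is the same.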
In the case of our one-on-one boss battle demo, we want to train the Knight character. But the Knight contends not only with the environment dynamics but also with Dr Arm, the human player. From the Knight’s perspective, you can therefore think of Dr Arm as part of the environment: his actions influence the next state and the reward the Knight receives. This means that when we train the Knight agent, we need Dr Arm to act as the Knight’s opponent.
Figure 6. Agent contends not only with environment but also target
The relative strength of Dr Arm also influences the training results. If Dr Arm is too strong, it is too difficult for the Knight agent to improve from scratch. On the other hand, if Dr Arm is too weak, the Knight learns to win but cannot compete with a stronger opponent. We need an opponent of roughly equal skill: challenging, but not too challenging.
Additionally, since our agent improves with each new game, its opponent needs to improve as well. If we think about it, the Knight agent itself satisfies both requirements: it is of roughly equal skill, and it improves over time. If we did not treat Dr Arm as an agent, a human would have to play him during training. That would prevent automating the training process and require significant human effort and time. Furthermore, there is no guarantee that playing Dr Arm manually during training improves his skill with each game. So it makes sense to train Dr Arm as an agent too.
In summary, the first step in creating the game is to train the two characters together as agents. At this step, we give the two characters the same agent code, meaning they share the same Neural Network (NN) model structure. This agent code communicates with the PyTorch backend through Unity’s communicator and the ML-Agents API, and the two agents train against each other. Once they are smart enough, we make Dr Arm controllable by a game controller. At game time, the agent only controls the Knight and uses the trained NN model.
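The idea of giving both characters the same agent code can be illustrated with a toy shared-architecture setup: one policy class, instantiated once per character, so both networks have identical structure while learning their own weights. The class name, layer sizes, and seeds below are assumptions for illustration, not the demo's actual network.

```python
import random

class PolicyNetwork:
    """Minimal fully connected policy: observation vector in,
    action scores out. Layer sizes are illustrative only."""
    def __init__(self, obs_size: int = 8, hidden: int = 16,
                 actions: int = 4, seed: int = 0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(obs_size)]
                   for _ in range(hidden)]
        self.w2 = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)]
                   for _ in range(actions)]

    def act(self, obs):
        # Hidden layer with ReLU, then a linear output layer.
        h = [max(0.0, sum(w * o for w, o in zip(row, obs))) for row in self.w1]
        scores = [sum(w * x for w, x in zip(row, h)) for row in self.w2]
        return scores.index(max(scores))  # greedy action index

# Same code and structure for both characters; each keeps its own
# weights (different seeds here stand in for independent learning).
knight = PolicyNetwork(seed=1)
dr_arm = PolicyNetwork(seed=2)

obs = [0.5] * 8
knight_action = knight.act(obs)
arm_action = dr_arm.act(obs)
```

Because the structures match, either character's trained model can later be loaded for inference by the Knight at game time, while Dr Arm switches over to game-controller input.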
Figure 7. Two phases for agents training and game play
This training setting is called Adversarial Self-Play: two interacting agents receive inverse reward signals. For example, as in zero-sum games, when one agent receives a positive reward, the competing agent receives a negative reward. This allows an agent to become increasingly skilled by always facing a perfectly matched opponent, and it was the strategy DeepMind employed when training AlphaGo.
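In ML-Agents, Adversarial Self-Play is enabled through a `self_play` section in the trainer configuration file. A sketch of what such a configuration might look like follows; the behavior name and every value here are illustrative assumptions, not the demo's actual settings:

```yaml
behaviors:
  Knight:                    # assumed behavior name
    trainer_type: ppo
    max_steps: 5.0e7
    self_play:
      save_steps: 20000      # snapshot the policy every N trainer steps
      team_change: 100000    # steps before the learning team switches
      swap_steps: 10000      # steps between opponent snapshot swaps
      window: 10             # pool of past snapshots to draw opponents from
      play_against_latest_model_ratio: 0.5
      initial_elo: 1200.0    # ELO rating used to track relative skill
```

Sampling opponents from a window of past snapshots, rather than always using the very latest policy, helps keep the opponent "challenging, but not too challenging" and stabilizes training.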
In part 2, I explore the agents' design in more detail. Check it out here.