In part 2 of this blog series, we showed how the game AI agents were designed for our Candy Clash Demo. Part 3 looks at how the game performs on mobile devices.
Up to now, I have mainly discussed how the multi-agent system works. I will now shift the discussion to inference performance on Arm when the game runs on mobile devices. Let's start by looking at the execution time when the ML-Agents models are run.
Figure 1. ML-Agents execution time
The graph illustrates how the execution time varies as the number of agents increases. Light blue shows the execution time when ML-Agents inference runs on the CPU, while dark blue shows the execution time on the GPU. As you can see, the CPU runs the models more quickly. This is because the ML-Agents NN models are not large enough to use the GPU efficiently, and the data transfer between the CPU and GPU becomes a bottleneck. In addition, the GPU is typically busy with graphics-intensive work, which is another reason to favor the CPU for ML-Agents inference.
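For reference, here is a minimal sketch of how inference could be pinned to the CPU in a Unity project. It assumes the ML-Agents BehaviorParameters component and its InferenceDevice property, and the available enum values differ between ML-Agents releases (recent releases expose Burst as the CPU backend), so treat it as illustrative rather than the exact setup used in Candy Clash.

```csharp
using Unity.MLAgents.Policies;
using UnityEngine;

// Illustrative sketch only: forces every agent's policy in the scene to run on the CPU.
// Assumes BehaviorParameters.InferenceDevice is available; the enum members
// (Burst vs. the older CPU value) depend on the ML-Agents release in use.
public class ForceCpuInference : MonoBehaviour
{
    void Awake()
    {
        foreach (var behavior in FindObjectsOfType<BehaviorParameters>())
        {
            // Burst is the CPU inference path in recent ML-Agents releases.
            behavior.InferenceDevice = InferenceDevice.Burst;
        }
    }
}
```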
Next, I would like to introduce frame-interleaving inference, an implementation technique I used to improve overall performance when running the ML-Agents models. Interleaving is commonly used to distribute processing over time, and I applied it in the same way to agent inference.
Figure 2. Frame-interleaving inference
This diagram illustrates how inference is distributed frame by frame. The horizontal axis is time, with frames numbered 1 to 8. We start with Team A's attackers in frame 1, followed by its defenders and wanderers. We then switch to Team B from frame 4: attackers, defenders, and then wanderers. We switch back to Team A at frame 7. As a result, the NN model for any given rabbit is executed once every six frames. In this way, the processing is distributed by rabbit role and team in turn.

The reason for grouping agents this way is that models with the same weights can be executed in a batch. Batch execution lets models that share weights run together, which is more efficient than running the models one by one. The image below compares Unity profiler screenshots when only the attacker role is executed in a single frame (top) and when all three roles are executed (bottom). Because all agents with the same role share the same weights, the models are batch-executed per role. Executing even one rabbit model of a different role in the same frame adds significant overhead, which suggests that executing one role per frame is the most efficient distribution.
Figure 3. Comparison of model executions between one role (top) and all roles (bottom) in a single frame
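To make the schedule in Figure 2 concrete, here is a small self-contained sketch (the names are hypothetical, not the demo's actual code) that maps a frame number to the team-and-role group that should run inference in that frame:

```csharp
// Hypothetical sketch of the Figure 2 schedule: six groups (team x role),
// one group per frame, so each rabbit's model runs once every six frames.
public enum AgentGroup
{
    TeamA_Attacker, TeamA_Defender, TeamA_Wanderer,
    TeamB_Attacker, TeamB_Defender, TeamB_Wanderer
}

public static class InterleaveSchedule
{
    const int GroupCount = 6;

    // Returns the single group whose agents should request a decision this frame.
    public static AgentGroup GroupForFrame(int frameCount)
    {
        return (AgentGroup)(frameCount % GroupCount);
    }
}
```

An agent then only requests a decision when GroupForFrame(Time.frameCount) matches its own group, which is exactly the one-role-per-frame pattern shown in the profiler comparison above.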
The planner's execution is also distributed, but in a different way. The planner for both teams runs every frame, but only one rabbit's role is updated per frame. Instead of updating all the rabbits at once, the role updates are spread across successive frames.
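As a rough sketch of this round-robin idea (again with hypothetical names, since the planner code is not shown here), the planner can keep an index into the team roster and advance it by one rabbit each frame:

```csharp
using System.Collections.Generic;
using UnityEngine;

// Stand-in for the game's rabbit agent type (hypothetical).
public class RabbitAgent : MonoBehaviour { }

// Illustrative sketch: the planner runs every frame, but refreshes the role
// of only one rabbit per frame, cycling through the roster in order.
public class PlannerRoundRobin : MonoBehaviour
{
    [SerializeField] List<RabbitAgent> rabbits = new List<RabbitAgent>();
    int nextIndex;

    void Update()
    {
        if (rabbits.Count == 0) return;

        // Re-evaluate a single rabbit's role this frame.
        UpdateRole(rabbits[nextIndex]);
        nextIndex = (nextIndex + 1) % rabbits.Count;
    }

    void UpdateRole(RabbitAgent rabbit)
    {
        // Placeholder for the planner's actual role-assignment logic.
    }
}
```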
The model execution interval can easily be set using the DecisionPeriod parameter of the DecisionRequester component provided by ML-Agents. However, it does not support frame-interleaving inference, so you need to implement that yourself. Without it, all rabbit models would be executed on the same frame: if the interval is set to 6, every rabbit model would run on frames 1, 7, 13, and so on. We modified the existing DecisionRequester code to implement this feature. Hopefully, we will soon generalize this code further and contribute it to the ML-Agents GitHub repository.
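To give an idea of what such a modification can look like, below is a hedged sketch of a DecisionRequester-style component with an added DecisionOffset field. This is not the code we actually changed, and the real DecisionRequester is driven by Academy step callbacks rather than FixedUpdate, but it captures the core idea: with a period of 6 and offsets 0 through 5, each team-and-role group requests its decisions on a different frame.

```csharp
using Unity.MLAgents;
using UnityEngine;

// Sketch of an interleaved decision requester. Unlike the stock component,
// each instance also has a DecisionOffset, so different agent groups ask for
// decisions on different frames instead of all on the same one.
[RequireComponent(typeof(Agent))]
public class InterleavedDecisionRequester : MonoBehaviour
{
    [Range(1, 20)] public int DecisionPeriod = 6;  // same meaning as in ML-Agents
    [Range(0, 19)] public int DecisionOffset = 0;  // which step within the period

    Agent m_Agent;
    int m_StepCount;

    void Awake()
    {
        m_Agent = GetComponent<Agent>();
    }

    void FixedUpdate()
    {
        if (m_StepCount % DecisionPeriod == DecisionOffset)
        {
            // This group's turn: run the NN model and pick a new action.
            m_Agent.RequestDecision();
        }
        else
        {
            // Between decisions, keep repeating the last decided action.
            m_Agent.RequestAction();
        }
        m_StepCount++;
    }
}
```

Giving Team A's attackers offset 0, its defenders offset 1, and so on reproduces the schedule shown in Figure 2.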
How effective is frame-interleaving inference? The graph below illustrates the changes in frame rate with and without this technique. The horizontal axis is the number of agents, and the vertical axis is the frame rate.
Figure 4. Frame rate with and without frame-interleaving inference
As the number of agents increases, the amount of processing required increases, leading to a decrease in the frame rate.
There are several key considerations to keep in mind when deploying ML-Agents to mobile devices: run inference on the CPU, since the models are small and the GPU is busy with graphics work; spread inference across frames with frame-interleaving so that all agents do not run on the same frame; and group agents that share the same weights so their models can be batch-executed.
In conclusion, our exploration of multi-agent systems presents exciting potential for mobile gaming. We have demonstrated that carefully designed roles, dynamic strategies, and efficient use of computational resources can lead to complex and emergent behaviors in games. While there are challenges to be addressed – from ensuring efficient processing on mobile devices to fine-tuning the training of multiple models – the future is promising. With continual technological advancements, we look forward to even more compelling game experiences and improved performance in the mobile gaming space.