Making Gaming Faster with Updatable Mali GPU Drivers and Android GPU Inspector

mattyclarkson
June 16, 2020

Every optimization matters in gaming. Optimizations can lead to better frame rates, higher-quality models, more beautiful pixels, and better battery life, which means longer playing sessions and better-looking games. Keeping the GPU well utilized is paramount to delivering the best quality at the highest frame rates. Arm has implemented support for updateable drivers and Android GPU Inspector to supercharge gaming on devices with Mali GPUs.

Currently, Android devices receive GPU drivers via over-the-air firmware images. Because the firmware needs to be extremely stable, full system over-the-air updates may occur only once or twice a year. The opportunity to fix bugs and optimize the driver is therefore limited to those update slots. If a game developer finds a bug in a driver, they have to wait until the next over-the-air update for the fix to reach an Android device. Arm is continuously optimizing the driver for Mali GPUs, but those improvements can only be delivered during over-the-air updates.

A mainstay of the PC gaming experience is receiving new GPU drivers that enable optimizations, new features, and more stable experiences. Android updateable drivers bring the same experience to users of Mali GPU devices. Updates are delivered via the Google Play store, without the device needing a full over-the-air update: an easy and familiar installation process.

A bug reported by a game developer can be fixed and pushed to the updateable driver beta channel for testing before being delivered through the Google Play store. Once the fix has been validated, Arm can promote the update to the stable channel. Gamers then benefit from the extra stability of the driver. As Arm implements optimizations in the driver, they can be continuously delivered to gamers, improving the gaming experience.

Upcoming Mali GPU drivers include support for Android GPU Inspector, which was recently announced by Google. This open-source, cross-vendor tool gives game developers insight into how their content runs on Mali GPUs. Using the profile information, game developers can optimize game content for increased frame rates on devices that use Mali GPUs. Not only can Arm deliver driver optimizations via updateable drivers, but game developers can also optimize their content using the profiler support in the driver. This is a real win-win.

Vulkan Performance Samples

To demonstrate the profiling capability of Android GPU Inspector, we can use the excellent Khronos Group Vulkan Samples. This project, which demonstrates best practices for the Vulkan API, contains a suite of performance examples. These provide toggles that show the effect API usage has on frame rates and memory bandwidth.

One of the examples shows the difference between using two and three swapchain images, commonly referred to as double and triple buffering. At the cost of a deeper pipeline and added input latency, a third image allows the GPU to start work on the next frame earlier, because it no longer has to wait for the image being displayed to be released. The example provides buttons for switching between double and triple buffering and measures the time taken to draw each frame. With vsync enabled, switching to triple buffering can lead to a jump from 30fps (33.3ms) to 60fps (16.7ms) on an Arm Mali-G76 GPU. Using Android GPU Inspector, we can record a trace of GPU activity during this example, which clearly shows GPU utilization increasing when triple buffering is turned on.
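As a minimal sketch of how an application might request double or triple buffering, the helper below clamps a desired image count to the range a surface reports. It is written in Python purely for illustration and is not taken from the Vulkan Samples. The function name is hypothetical, and its parameters stand in for the `minImageCount` and `maxImageCount` fields of `VkSurfaceCapabilitiesKHR`, where a `maxImageCount` of 0 means there is no upper limit.

```python
def choose_swapchain_image_count(desired, min_image_count, max_image_count):
    """Clamp a desired swapchain image count to what the surface supports.

    Hypothetical helper: the last two arguments mirror the values an
    application would read from VkSurfaceCapabilitiesKHR.
    """
    # Never ask for fewer images than the surface requires.
    count = max(desired, min_image_count)
    # A maximum of 0 means "no upper limit", so only clamp when it is non-zero.
    if max_image_count != 0:
        count = min(count, max_image_count)
    return count


# Triple buffering requested on a surface with no upper limit.
print(choose_swapchain_image_count(3, 2, 0))  # 3
# Triple buffering requested, but the surface only allows two images.
print(choose_swapchain_image_count(3, 2, 2))  # 2
```

In a real application the result would be written to the `minImageCount` field of `VkSwapchainCreateInfoKHR`. The implementation may still create more images than requested.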

Double Buffering

The previous screenshot is of the example running with double buffering enabled. Notice that the frame time is 33.4ms, which represents 30fps. The toggles at the bottom allow easy switching between double and triple buffering.

Android GPU Inspector Trace

The previous image shows an example Android GPU Inspector trace. The greyed-out left section of the trace was captured with double buffering enabled, and the right section with triple buffering enabled. The greyed-out section clearly shows the GPU stalling while it waits for a buffer to draw into, visible as gaps between the submission of work and activity on the GPU hardware queues. This greatly reduces the GPU active cycles.

Once triple buffering is turned on, the submission of work is far more regular as the GPU is not waiting on a buffer to draw into. The utilization of the GPU is much higher, shown by the more densely populated GPU active cycles counter.

Triple Buffering

With triple buffering enabled, the time to draw a frame drops to 16.7ms, which is a jump to 60fps. This is due to the GPU being able to start work on the next frame in the third buffer straight away.
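Why a third image can double the frame rate is easiest to see in a toy model. The sketch below assumes that each frame needs 10ms of CPU work followed by 10ms of GPU work, that the display refreshes at 60Hz, and that an image can only be reused once the frame after it has reached the screen. These timings and the function name are illustrative assumptions, not measurements from the sample or a description of the Mali driver.

```python
def vsyncs_per_frame(num_images, cpu_us=10_000, gpu_us=10_000,
                     vsync_us=16_667, frames=60):
    """Steady-state number of vsync intervals between displayed frames.

    Toy model only: all timings are assumed values in microseconds and
    num_images must be at least 2.
    """
    cpu_free = 0   # time at which the CPU can start the next frame
    gpu_free = 0   # time at which the GPU finishes its current frame
    flips = []     # vsync index at which each frame reaches the display
    for k in range(frames):
        # Frame k reuses the image shown by frame k - num_images. That image
        # leaves the display when frame k - num_images + 1 is flipped in.
        releasing = k - num_images + 1
        image_free = flips[releasing] * vsync_us if releasing >= 0 else 0
        # CPU work starts once the CPU is idle and an image is available.
        start = max(cpu_free, image_free)
        cpu_free = start + cpu_us
        # GPU work starts after the CPU work and after the previous frame.
        gpu_free = max(cpu_free, gpu_free) + gpu_us
        # Flip at the first vsync at or after GPU completion (ceiling
        # division), showing at most one new frame per vsync.
        first_vsync = -(-gpu_free // vsync_us)
        previous = flips[-1] if flips else -1
        flips.append(max(first_vsync, previous + 1))
    return flips[-1] - flips[-2]


for images in (2, 3):
    interval = vsyncs_per_frame(images)
    print(images, "images:", 60 // interval, "fps")
# 2 images: 30 fps
# 3 images: 60 fps
```

With these assumed timings, the 20ms of work per frame exceeds one 16.7ms vsync interval. Two images force every frame to miss a vsync, while three images let the CPU work of one frame overlap the GPU work of the previous one.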

Crytek Neon Noir

While the Vulkan Samples exhibit clear improvements in GPU utilization, as shown in Android GPU Inspector, game content is often far more complex. Game development and technology company Crytek has a hardware-agnostic PC demo named Neon Noir, which provides amazing graphics fidelity on modern hardware. Crytek worked with Arm and Google to port the demo to mobile and make use of Arm Mali GPUs. This is an incredible achievement, as moving such an intense graphics load to a mobile GPU is a challenging process, and profiling is required to determine how to load the GPU effectively. The following traces were captured on beta code that does not represent the final demo result, but they do show remarkable improvements from profile-guided optimization.

Android GPU Inspector Trace for Crytek Neon Noir - Before

The previous image is of a trace showing the work required for a frame. The profiling data clearly shows an opportunity to load both GPU hardware queues more. There are dependencies between the vertex and fragment work that can be untangled to allow more work to run in parallel.

A frame takes 78ms to render on a Mali-G76, resulting in 12fps. While 12fps is low, moving such a heavy graphics load to a mobile GPU is a noteworthy accomplishment. This profiling data is a candid insight into the engineering process of moving game content between platforms.

Android GPU Inspector Trace for Crytek Neon Noir - After

After careful analysis of the profiling data, Crytek was able to optimize the GPU work. The previous trace image shows better loading of the GPU hardware queues and a greatly reduced frame rendering time. The 33ms reduction is an improvement of 43% and raises the frame rate to 22fps.
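The frame-time figures above can be sanity-checked with a few lines of arithmetic. This sketch uses only the rounded values quoted in the text (78ms before, a 33ms reduction), so its results differ slightly from the quoted 12fps, 22fps and 43%, which were presumably derived from unrounded measurements.

```python
def frames_per_second(frame_time_ms):
    """Convert a frame time in milliseconds to frames per second."""
    return 1000.0 / frame_time_ms


before_ms = 78.0                 # frame time quoted for the first trace
after_ms = before_ms - 33.0      # 45.0ms after the quoted 33ms reduction
reduction = (before_ms - after_ms) / before_ms

print(f"{frames_per_second(before_ms):.1f} fps")  # 12.8 fps
print(f"{frames_per_second(after_ms):.1f} fps")   # 22.2 fps
print(f"{reduction:.1%}")                         # 42.3%
```

The same conversion explains the earlier figures: 33.3ms per frame is 30fps and 16.7ms per frame is 60fps.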

What is remarkable about this iteration of the demo is that Crytek improved GPU utilization while also adding extra graphical features, such as volumetric fog and post-processing anti-aliasing.

Conclusion

The traces provided in this blog post can be opened in the latest developer releases of Android GPU Inspector. They are provided as zstd-compressed traces, so make sure to unpack the .perfetto trace file before opening it for visualization. Android GPU Inspector is still in development, so the traces and user interface are not representative of the final experience.

Arm, Google, and Samsung have been collaborating closely to support the Samsung Galaxy S10, Samsung Galaxy Note 10 and Samsung Galaxy S20, with further devices in the pipeline. Arm has also launched the Arm Mali-G78 and Arm Mali-G68 GPUs, which bring further performance and efficiency improvements for higher-quality gaming experiences on mobile. Finally, Crytek has continued to use Android GPU Inspector to optimize their content, so expect further technical detail in the near future.
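For the compressed traces mentioned above, a quick way to tell whether a downloaded file still needs unpacking is to look at its first four bytes, since a standard zstd frame begins with a fixed magic number. The sketch below is only a convenience check with hypothetical function names. The unpacking itself would be done with a zstd tool, for example the `zstd -d` command.

```python
# Magic number that starts a standard zstd frame (0xFD2FB528, little-endian).
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"


def has_zstd_magic(header):
    """Return True if the given leading bytes start with the zstd magic."""
    return header[:4] == ZSTD_MAGIC


def is_zstd_compressed(path):
    """Return True if the file at `path` looks like a zstd-compressed file."""
    with open(path, "rb") as trace_file:
        return has_zstd_magic(trace_file.read(4))
```

If `is_zstd_compressed` returns `True` for a downloaded trace, decompress it first and open the resulting .perfetto file.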

Learn more about Mali GPUs
