For the past few years, Arm’s partners have shipped over one billion Mali GPUs annually, making Mali the number one shipped GPU on the planet. This number is only going to increase as more tiers and types of devices enable graphics-intensive use-cases, from advanced mobile gaming through to XR (virtual reality (VR) and augmented reality (AR)). All of this makes Mali the most widely used GPU for mobile development across the ecosystem.
2019’s Mobile IP launch saw a step-up in our Mali range of products, with our very first GPU based on the new Valhall architecture. The Arm Mali-G77 GPU represented a big leap in performance and energy efficiency. In 2020, we have been able to take this one step further with the Arm Mali-G78 GPU, our most performant GPU for premium mobile devices to date. The numbers support this: Mali-G78 brings a 25 percent improvement in performance device to device, delivering ‘game-changing’ graphics and all-day gaming on mobile.
Arm's highest performing GPU
Mali-G78 is designed with developers and the end user in mind, making sure premium mobile devices can provide the immersive entertainment experiences that developers are building and users are craving. Mali-G78 enables high-quality mobile gaming experiences, with console games now available on mobile. Alongside higher quality comes greater quantity, with Mali-G78 bringing longer battery life to premium mobile devices. This means users can enjoy their favorite gaming applications for longer than ever before. Mali-G78 also brings a further machine learning (ML) performance boost, helping to enable more complex gaming, video, camera, and security ML features on mobile devices.
Immersive experiences through Mali GPUs
As discussed in a previous blog, the games industry is experiencing a shift in focus towards being ‘mobile-first’. According to the market intelligence agency Newzoo, mobile accounted for more than 46 percent of the global games market in 2019, reaching $68.2 billion in revenues. Mobile gaming is set to continue growing over the next few years, outpacing both PC and console gaming.
This phenomenal growth is bringing more premium gaming titles to mobile. Popular high-fidelity and multi-user games, such as Fortnite and PlayerUnknown's Battlegrounds (PUBG), are being played on mobile devices as well as consoles, and users expect a similar experience. This is where Mali-G78 plays a hugely important role.
Mali-G78 provides the performance boost required to make these PC and console-like gaming experiences possible on mobile. There is a 15 percent performance density improvement for gaming content compared to Mali-G77. This means that Mali-G78 will give more performance for the same amount of area as the previous generation. The performance boost is made possible by four key features:
We have increased the maximum core count to enable our highest ever performance. The maximum core count on Mali-G77 was 16, so we have pushed for greater performance with support for up to 24 cores. Asynchronous Top Level – a feature we believe to be a ‘game-changer’ for GPU performance – then ensures that all this performance is delivered efficiently and effectively across all the cores. This squeezes as much performance out of mobile games as possible, ensuring maximum performance productivity.
The benefits of Asynchronous Top Level
Tiler improvements add an extra layer of quality to mobile games. Games that are adapted from PC and console to mobile often have extremely complicated assets and sophisticated scenes. These cause performance sticking points and bottlenecks. Improvements to the tiler reduce the vertex load on the GPU for these complex scenes and assets. This improves performance for complicated PC and console-like gaming content.
Bringing complex gaming content to life
Finally, we have enhanced the fragment dependency tracking on Mali-G78. Again, this particularly affects mobile games with complex gaming scenes involving smoke, trees, and grass. The results from this feature change are impressive. On different frames, we see up to 17 percent performance improvements on top mobile games compared to Mali-G77 performance.
We are already working with the game and technology development company Crytek to bring their renowned CRYENGINE game engine to the Android mobile ecosystem first. We are working closely with Crytek and Google to ensure that Crytek’s flagship ‘Neon Noir’ demo fully utilizes Vulkan on Arm Mali to achieve outstanding graphic fidelity. This is a great example of developers utilizing and optimizing the benefits of Mali GPUs to achieve truly immersive mobile gaming.
Mali-G78’s performance improvements are impressive but there is always more to do from a system perspective. Mobile devices need longer battery life so users can enjoy their favorite gaming applications on the go. Users want their mobile devices to last the whole day without needing to charge, even if the applications being used are highly compute intensive and, as a result, draining the battery. Therefore, we need performance to be improved in a sustainable way. Alongside the 15 percent performance density uplift, Mali-G78 has 10 percent better energy efficiency.
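To make the two headline figures concrete, the short Python sketch below applies them to a made-up baseline. It assumes that performance density means performance per unit of silicon area and that energy efficiency means performance per watt. The baseline values (100 performance units, 2 W) are hypothetical placeholders rather than Arm measurements; only the 15 percent and 10 percent uplifts come from the text above.

```python
def scaled_performance(baseline_perf: float, density_uplift: float) -> float:
    """Performance in the same silicon area after a performance density uplift."""
    return baseline_perf * (1.0 + density_uplift)


def power_for_same_performance(baseline_power: float,
                               efficiency_uplift: float) -> float:
    """Power needed to deliver the same performance after an efficiency uplift.

    Assumes energy efficiency is measured as performance per watt.
    """
    return baseline_power / (1.0 + efficiency_uplift)


# Hypothetical baseline standing in for a previous-generation configuration.
BASELINE_PERF = 100.0    # arbitrary performance units, illustrative only
BASELINE_POWER_W = 2.0   # watts, illustrative only

g78_perf = scaled_performance(BASELINE_PERF, 0.15)
g78_power = power_for_same_performance(BASELINE_POWER_W, 0.10)

print(f"Same area, performance: {g78_perf:.1f} units")
print(f"Same performance, power: {g78_power:.3f} W")
```

Under these assumptions, a 10 percent efficiency gain corresponds to roughly 9 percent less power for the same work, since the saving is 1 - 1/1.10 rather than a flat 10 percent.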
As with performance, the Asynchronous Top Level feature plays a vital role in energy efficiency. Using Asynchronous Top Level enables a reduction in power, so content is generated in a sustainable way. This means that when the device is outputting content at a desired frame rate, it can clock down to save energy. Running the Asynchronous Top Level faster for this task uses a bit more energy, but the energy saving from reducing the frequency of the shader cores is far greater. This is because the shader cores use 90-95 percent of the GPU’s energy budget.
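The trade-off described above can be illustrated with a simple weighted sum. The sketch below is a toy model, not Arm's power model: it splits the GPU energy budget into a shader core share (the 90-95 percent quoted above) and a top-level share, then scales each part independently. The function name and the two scale factors (top level using 20 percent more energy, shader cores using 10 percent less) are hypothetical values chosen only to show the shape of the argument.

```python
def relative_gpu_energy(shader_share: float,
                        shader_energy_scale: float,
                        top_level_energy_scale: float) -> float:
    """Return total GPU energy relative to a baseline of 1.0.

    shader_share is the fraction of baseline energy used by the shader
    cores; the remainder is attributed to the top level.
    """
    if not 0.0 <= shader_share <= 1.0:
        raise ValueError("shader_share must be between 0 and 1")
    top_level_share = 1.0 - shader_share
    return (shader_share * shader_energy_scale
            + top_level_share * top_level_energy_scale)


# Hypothetical scenario: the top level uses 20% more energy so that the
# shader cores can clock down and use 10% less. Both scale factors are
# illustrative assumptions, not figures published by Arm.
for share in (0.90, 0.95):
    energy = relative_gpu_energy(share,
                                 shader_energy_scale=0.90,
                                 top_level_energy_scale=1.20)
    print(f"shader share {share:.0%}: relative GPU energy {energy:.3f}")
```

With these assumed inputs, the total falls by about 7 to 8.5 percent even though the top level uses more energy, because the shader cores dominate the budget.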
Energy reductions through Mali-G78
Another important feature leading to better energy efficiency in Mali-G78 is the new fused multiply-add (FMA) unit. This has been completely redesigned from the ground up, leading to a 30 percent energy reduction in the unit. The FMA unit is responsible for most of the calculations that happen inside a GPU. Therefore, it was a good candidate to target for energy reductions.
Although the GPU’s primary function is graphics processing, its parallel data processing capability makes it suitable for running ML workloads. While the CPU and NPU remain the primary processors for ML, as use-cases get more complex some of these will be offloaded to the GPU. The main ML use-cases for the GPU are linked to security features on the device, different camera and video modes, and applications with AR features.
Focusing specifically on applications, the role of ML on the GPU is important. Real-time AR emojis are a fun feature on modern communication applications, such as Snapchat. This transposes AR cartoon features onto the user’s face when taking a photo or video. The GPU is used to detect the emotion of the face to auto-select the appropriate emoji. Face tracking within the photo or video frame can also be carried out by the GPU. Moreover, more compute-intensive AR-based applications are also possible on the smartphone thanks to ML on the GPU, such as mobile gaming apps that utilize AR features. These games use the GPU to transpose the AR graphics and features onto real-world environments, with ML improving this process.
On-device ML performance boost
To carry out these various ML-based tasks, Mali-G78 has seen an average 15 percent performance improvement for various ML workloads compared to Mali-G77. This improvement has been made despite Mali-G77 bringing a huge 60 percent improvement to ML performance over previous GPU generations. Yet again, Asynchronous Top Level is vital in boosting ML performance, as clocking the shader cores independently of the top level helps with the various ML use-cases on the GPU.
Alongside Mali-G78, we are also launching the Arm Mali-G68 GPU. This is the first sub-premium Mali GPU for 2021 devices. It inherits all of the features from Mali-G78, such as the tiler improvements and the new FMA unit in the execution engine, but supports up to 6 cores rather than 24. Therefore, it offers an alternative for those wanting near-premium performance at a lower cost.
The features of Mali-G68
The sub-premium GPU tier was developed after we listened to our partners who wanted to scale premium features and technology across their portfolio of devices. This helps to bring premium use-cases, like high-performance gaming, to a wider audience of developers and consumers. In addition, our partners wanted a cost reduction in the design and layout work required for the multiple GPU designs. Mali-G68 allows them to reuse their design work and scale it down to a lower silicon area.
As well as constantly improving our Mali GPU products, we are creating key ecosystem partnerships and tools that improve the overall developer experience. This allows developers to create the most exciting, engaging, immersive, and highest performing applications.
Recently, we have looked at our performance analysis tools to make it as easy as possible for developers to optimize their own content to make it run even better on Mali GPUs. One good recent example is Performance Advisor, which is freely available from the Arm website. This tool makes it easy for developers to optimize content on Mali by doing the hard work around performance optimization. It quickly detects bottlenecks and even provides some performance improvement suggestions. This is important, as developers do not want to spend time working out where problem areas are in their code. They want to immediately know where the problem is, so they can spend their time actually solving it.
Arm Performance Advisor
Away from tools, our partnership with Unity also improves the developer experience and will bring gaming performance by default. This enables developers to spend more time creating compelling content, and less time optimizing code. We are also working with Unity to help bring the power of the Burst Compiler to Android to enhance multiprocessor performance and power management. Furthermore, we are integrating our own performance analysis capabilities with Unity tools to enable a more seamless developer experience and improve Unity performance on Arm’s technologies.
Due to the reach of Unity among game developers, we believe that this partnership is a significant development for the mobile ecosystem. The Unity game engine powers more than 50 percent of all mobile games. Therefore, a vast number of gaming applications will not only be brought to market quicker, but also be higher performing and more efficient. Ultimately, this benefits the end-consumer playing their favorite mobile games, using their favorite applications or even creating their own content.
Mali-G78 represents the perfect balance of performance and energy-efficiency improvements. The GPU provides the performance needed for silicon vendors, device manufacturers and developers to enable the range of complex immersive entertainment experiences on mobile devices. This includes optimizing high-quality PC and console-like gaming on mobile, more entertainment and more ML-based use-cases and applications. However, Mali-G78 also offers energy reduction benefits enabling all-day battery life. This means all these immersive entertainment experiences can be enjoyed for longer without mobile devices losing power while users are on-the-go.
Summary of Mali-G78 features
The complementary nature of performance and energy efficiency is best illustrated through the brand-new features of 24 shader cores and the ‘game-changing’ Asynchronous Top Level. This technology combination, alongside our range of performance optimization tools, provides maximum performance in the most productive and efficient way possible, unlocking many of Mali-G78’s benefits for developers building the applications of the future and 2021’s next-generation mobile devices.