In the past year, our partners have shipped over one billion Mali GPUs in many different consumer devices. This is the fifth consecutive year that we have achieved this milestone, making Mali the most pervasive GPU on the planet. In 2021, we are showing our commitment to this market-leading position by introducing the widest range of GPUs that we have ever announced at a single time. Each has targeted performance and efficiency points for different market segments, use cases and consumer devices. Crucially, all the new GPUs bring unprecedented flexibility, which enables our partners to precisely focus on the right set of optimizations for their individual products. Each GPU is a key component of our new Total Compute solutions that are all optimized at a system level. These offer different levels of performance, efficiency, and scalability across multiple consumer device markets and use cases.
The flagship Arm Mali-G710 GPU is our highest performing GPU ever. It targets premium smartphones and delivers better and longer compute-intensive entertainment experiences, such as AAA high-fidelity gaming. The Arm Mali-G510 GPU represents the perfect balance of performance and efficiency, targeting mid-range smartphones, premium DTVs, set-top boxes (STBs) and Chromebooks. Finally, the Arm Mali-G310 is our most performant efficiency GPU, with huge performance gains compared to the previous generation Arm Mali-G31 GPU. Mali-G310, the first ever Valhall-based efficiency GPU, targets entry-level smartphones, entry-level and mid-range DTVs and STBs, smartwatches, and AR and VR wearables.
Mali-G710 provides uplifts in performance (20 percent), energy efficiency (20 percent) and machine learning (ML) (35 percent) compared to the previous generation Arm Mali-G78 (ISO process). Every year a new generation of premium and flagship smartphones must deliver an increased level of performance to drive the newest consumer experiences.
Gaming is a big focus for Mali-G710. 2020 was a huge year for mobile gaming, with the gaming market intelligence agency Newzoo estimating that mobile game revenues hit $76.7 billion. This represents a 12 percent increase from 2019, and mobile now outstrips PC and console gaming revenues. Mobile gaming experiences are also getting more complex, with more premium AAA gaming experiences coming to mobile. Smartphones need to match the greater gaming complexity to make these experiences possible through the following enhancements:
Mali-G710 brings a range of new ‘game-changing’ features and technologies that service the demands for gaming enhancements on premium smartphone devices. These new features also enable uplifts in performance, energy efficiency and ML.
Command Stream Frontend (CSF)
The introduction of the new command stream frontend (CSF) is a significant change, replacing the job manager that existed in previous Mali GPUs. CSF aligns Mali GPUs to the requirements of modern APIs, such as Vulkan, and future mobile gaming content trends. One of the biggest benefits of CSF is that it reduces the amount of work that the CPU has to perform. This, in turn, lowers the power budget that needs to be provided to the CPU, enabling the GPU to perform more tasks. This is a great example of how our Total Compute strategy of optimizing across the system brings real, tangible benefits. Importantly, this key technology is completely transparent to developers, who can continue to target the familiar GLES and Vulkan APIs.
Looking deeper at the Shader Core
One of the areas we have extensively worked on for the Mali-G710 is the Shader Core, which has been heavily redesigned for increased performance density. Mali-G710 has a configurable number of cores, starting from 7 cores and scaling up to 16 cores. This is fewer than Mali-G78, which scaled up to 24 cores. However, the cores are bigger, more performant, and more energy-efficient.
Focusing on the specifics inside the shader core, first, the execution engine has been redesigned for energy efficiency improvements. We have added a second execution engine in each shader core, doubling the compute capability of each core and making more efficient use of shared resources. Adding this second execution engine doubles FMA capabilities from Mali-G78, delivering 64 FMA per cycle per core. We focused on the execution engine for the redesign. It was an obvious target for energy reductions (especially when adding a second engine), as it represented 60 percent of the shader core’s overall energy demand in Mali-G78. The execution engine redesigns in the Mali-G710 lead to an energy saving of 20 percent overall (ISO process). This helps to deliver Arm’s highest ever energy efficiency in a premium GPU, leading to longer battery life on the target consumer devices. To the end user, this means they can “do more” and “play more”, with longer connected productivity and entertainment ‘on-the-go’.
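The core counts and per-core FMA figures above can be combined into peak per-cycle throughput. The following is a minimal sketch rather than an official calculation: it assumes that "doubles" implies 32 FMA per cycle per core for Mali-G78, it ignores clock frequency, and the function and constant names are hypothetical.

```python
# Peak FMA operations per clock cycle, derived only from the figures quoted above.
# Assumption: Mali-G78 delivers 32 FMA/cycle/core, inferred from "doubles ... 64".
G78_FMA_PER_CORE = 32
G710_FMA_PER_CORE = 64


def peak_fma_per_cycle(cores: int, fma_per_core: int) -> int:
    """Return the peak FMA operations per cycle for `cores` shader cores."""
    if cores < 1:
        raise ValueError("a GPU needs at least one shader core")
    return cores * fma_per_core


g78_max = peak_fma_per_cycle(24, G78_FMA_PER_CORE)    # largest Mali-G78 configuration
g710_min = peak_fma_per_cycle(7, G710_FMA_PER_CORE)   # smallest Mali-G710 configuration
g710_max = peak_fma_per_cycle(16, G710_FMA_PER_CORE)  # largest Mali-G710 configuration

print(g78_max, g710_min, g710_max)
```

Under these assumptions, a 16-core Mali-G710 peaks at 1024 FMA per cycle against 768 for a 24-core Mali-G78, despite having fewer cores.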
In addition to the extensive work on the execution engine and shader core, we have redesigned the texture unit. This doubles the texture performance of Mali-G710 compared to the previous generation. However, this doubling of the performance does not cost twice the area. With the new unit, we have doubled the performance while only increasing the area by 50 percent, meaning a significant improvement in performance density. The benefit of enhanced texturing capability is especially applicable for complex gaming assets and scenes.
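The performance density claim follows directly from the two ratios quoted above. As a small worked example (the function name is hypothetical, and "performance density" is taken here to mean performance divided by area):

```python
from fractions import Fraction


def performance_density_gain(perf_ratio: Fraction, area_ratio: Fraction) -> Fraction:
    """Return the relative change in performance per unit area."""
    return perf_ratio / area_ratio


# Texture unit: 2x the performance for 1.5x the area (a 50 percent increase).
gain = performance_density_gain(Fraction(2), Fraction(3, 2))
print(gain)
```

The result is 4/3, which corresponds to roughly a one-third improvement in texture performance per unit area.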
As with every premium GPU, we bring yet more ML uplifts. This is important as the GPU is now being used for a variety of different ML-related tasks, particularly for image enhancement and re-training. This brings advanced user experiences to the smartphone device, such as new camera and video modes, as well as security enhancements. For example, with portrait mode in smartphone cameras (a common ML workload), the GPU can be used to extract the depth map and segmentation as part of the ML compute pipeline. Being able to carry out ML tasks on the GPU is important, as it can bring precision flexibility for the new frontier of networks in the mobile space.
Just like in 2020, we are announcing an accompanying sub-premium GPU called the Arm Mali-G610. This inherits all of the features from Mali-G710, like the new CSF, but it has fewer configurable shader cores (1-6). This brings near premium performance to sub-premium smartphones that are at a lower price point, helping to bring premium use-cases, like high-performance AAA gaming, to a wider audience of developers and consumers. Partners who license Mali-G710 can reuse that investment to rapidly bring the latest GPU features to a wider audience in the sub-premium segment.
This year we are delighted to launch our performance and efficiency GPUs alongside the yearly premium GPU. The mid-range and entry-level segments now require more compute and more performance than before, with both Mali-G510 and Mali-G310 delivering new features and technologies to a broad range of consumer devices. Mali-G510 and Mali-G310 bring significant performance uplifts compared to the previous generation of GPUs, while offering bandwidth-reduction features that give an additional boost to performance and power savings. At the same time, both GPUs have been designed to offer true scalability. They offer a wide range of configuration choices to achieve the precise area and performance points needed by our partners and the Arm ecosystem to create next-generation consumer devices. These include entry and mid-level smartphones, AR and VR applications, DTVs, STBs, Chromebooks, and tablets.
Mali-G510 offers the perfect balance of performance and efficiency. It delivers a 100 percent performance improvement, 22 percent energy savings for longer battery life and a 100 percent ML uplift compared to the previous generation Arm Mali-G57. It is designed to deliver selected performance and energy operating points across a wide range of different configurations.
Meanwhile, Mali-G310 offers huge performance uplifts compared to the previous generation Arm Mali-G31 across three performance areas – texturing performance (6x), Vulkan performance (4.5x) and Android UI content (2x). These huge improvements are due to Mali-G310 being the first ever efficiency GPU built on the Valhall architecture. It also benefits from the micro-architectural changes over our last three GPU generations. Essentially, Mali-G310 is designed to deliver the highest performance at the smallest area cost.
There are various trends within the different target devices in the mid-range and entry-level markets that are pushing this demand for greater performance. These include:
The significant performance uplifts of Mali-G510 and Mali-G310 are made possible by both GPUs taking on features and enhancements from the premium Mali-G710. These are then optimized for different performance, power, and area (PPA) points. The big difference from the Mali-G710 is the number of shader cores, with Mali-G510 having 2-6 configurable shader cores and Mali-G310 having one. However, both GPUs inherit CSF, the redesigned and additional execution engine, and the redesigned texture unit from Mali-G710. Mali-G510 and Mali-G310 also adopt additional features on top that cater to a broad range of devices. For example, Mali-G510 provides formats for better HDR support, Arm Frame Buffer Compression (AFBC) uncompressed buffers and the new Arm Fixed Rate Compression (AFRC) for bandwidth reductions (more on that later). Similarly, Mali-G310 also offers formats for better HDR support and AFBC uncompressed buffers. AFRC is optional for Mali-G310, but the GPU also offers foveated rendering (a feature of Mali-G57) for an AR and VR boost.
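The shader core ranges quoted in this post can be collected into a small lookup table. This is only an illustrative sketch: the class and function names are hypothetical, and it captures core counts alone, not the other configuration options each GPU offers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuCoreRange:
    """Shader core range for one GPU, as quoted in this post."""
    name: str
    min_cores: int
    max_cores: int

    def supports(self, cores: int) -> bool:
        return self.min_cores <= cores <= self.max_cores


GPUS = (
    GpuCoreRange("Mali-G710", 7, 16),
    GpuCoreRange("Mali-G610", 1, 6),
    GpuCoreRange("Mali-G510", 2, 6),
    GpuCoreRange("Mali-G310", 1, 1),
)


def candidates(cores: int) -> list[str]:
    """Return the GPUs whose quoted core range includes `cores`."""
    return [gpu.name for gpu in GPUS if gpu.supports(cores)]


print(candidates(4))
```

For a four-core design, for example, this returns Mali-G610 and Mali-G510, the two GPUs whose quoted ranges overlap at that point.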
The features described above allow each GPU to have different configuration options to address specific devices and different performance and efficiency needs. Ultimately, this means greater scalability for both GPUs. Mali-G510 has ten configuration options, while Mali-G310 has five. In fact, Mali-G510 has the highest level of product configurability and granularity of any Mali GPU ever. Each configuration addresses various area and performance points, and differing compute and texturing needs. Mali-G310 offers a range of configurations well suited to area-focused implementations by pairing a configurable single shader core with an area optimized CSF and tiler.
We are introducing our visually lossless fixed rate compression, AFRC, into the market for the first time with Mali-G510 and optionally with Mali-G310. AFRC offers exceptional visual quality with high fixed compression rates. For many years, Arm and our partners have made use of lossless Arm Frame Buffer Compression (AFBC). AFBC delivers lossless compression; however, because it is lossless, a memory bandwidth reduction is not guaranteed. The new AFRC technology guarantees a bandwidth and memory footprint reduction, depending on the level of compression and type of content, at a minimum area cost. This translates into performance uplifts and energy savings due to less data being read from and written to DRAM. The use of AFRC framebuffers alone gives a peak of 60 percent reduction in bandwidth, while providing an 80 percent increase in peak performance. This, in turn, reduces the cost of the memory subsystem and DRAM, which then brings down the cost of the SoC itself.
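To give a rough sense of what a 60 percent bandwidth reduction means, the sketch below estimates framebuffer write traffic. The resolution, pixel size and frame rate are illustrative assumptions rather than Arm figures, the function names are hypothetical, and the model counts only one framebuffer write per frame, so real traffic will differ.

```python
def framebuffer_bytes_per_second(width: int, height: int,
                                 bytes_per_pixel: int, fps: int) -> int:
    """Traffic from writing one uncompressed framebuffer per frame."""
    return width * height * bytes_per_pixel * fps


def with_reduction(bytes_per_second: int, reduction_percent: int) -> int:
    """Apply a fixed percentage bandwidth reduction (integer arithmetic)."""
    if not 0 <= reduction_percent <= 100:
        raise ValueError("reduction_percent must be between 0 and 100")
    return bytes_per_second * (100 - reduction_percent) // 100


# Assumed workload: 1920x1080, 4 bytes per pixel, 60 frames per second.
uncompressed = framebuffer_bytes_per_second(1920, 1080, 4, 60)
# 60 is the peak reduction quoted above for AFRC framebuffers.
compressed = with_reduction(uncompressed, 60)

print(uncompressed, compressed)
```

For this assumed workload, the traffic drops from 497,664,000 to 199,065,600 bytes per second. Because the rate is fixed, the reduced figure can be budgeted for in advance, which is the guarantee a purely lossless scheme cannot make.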
We are not only committed to continuous improvements with each GPU generation. We are also making sure that this performance is easier to access with investments into the wider software ecosystem. This is important as Mali is the most widely used GPU for mobile development across the ecosystem.
We are delivering a Total Gaming experience by working with leading game engines, studios, and developers to optimize the performance on Mali-based mobile devices through new graphics technologies, tools, and developer education resources. Arm has a long-standing relationship with Unity, where we work together to provide graphics improvements and resources to developers and their game applications. This brings smoother, more efficient gaming experiences to the billions of consumers who play Unity-generated content on Mali-based mobile devices.
We are also continuously refining our free performance analysis suite for game developers known as Arm Mobile Studio, which enables fast and intuitive graphics optimizations for Mali GPUs. This provides whole system processing and performance activity analysis, allowing ‘non-experts’ to predict and improve the performance and efficiency of their mobile games. The aim is to make sure that building on Mali GPUs is as straightforward as possible. Already, Arm Mobile Studio is being utilized by leading games studios like Wargaming and King to help optimize their gaming content for mobile. For Wargaming, we integrated Arm Mobile Studio Professional into their CI workflow, bringing great cost savings and performance benefits when it targeted World of Tanks for mobile. King also used Arm Mobile Studio in the development of Crash Bandicoot: On the Run! for mobile, which led to significant performance improvements for the game.
In 2021, we are introducing GPUs for all market segments. The suite of GPUs covers a broad range of consumer devices and a wide range of entertainment and productivity experiences, with the flexibility and scalability to target different performance and efficiency needs. While the Mali-G710 continues to push premium performance that empowers console-like AAA gaming experiences on mobile, the adoption of premium features and enhancements by Mali-G510 and Mali-G310 means more advanced user experiences and graphics across a broad range of lower-cost consumer devices. We believe that this breadth of GPU capabilities and the unmatched flexibility will continue Mali’s market-leading position as the world’s number one shipping graphics processor.
Learn more about Mali GPUs: https://www.arm.com/products/silicon-ip-multimedia
Learn more about the Total Compute Solutions: https://www.arm.com/solutions/mobile-computing