In May 2021, we introduced Arm’s first ever Total Compute solutions. This is the realization of a new Total Compute strategy and approach to SoC design where Arm is focused on moving beyond individual IP elements and designing and optimizing the whole system to create use-case driven solutions to power the next decade of compute.
Arm's Total Compute solutions aim to address the three core SoC design challenges of compute performance, security, and developer access. Switching to a solutions-focused approach enables Arm and our partners to deliver complete SoC solutions optimized for specific use cases and best-in-class performance on next-generation devices. Complex use cases on these devices require greater performance, but improving individual IP alone is not enough to deliver it. This is why Arm is focusing on optimizing performance at a system level across all IP boundaries. For security, the diverse IP components in an SoC make it difficult to engineer a consistent security approach. Finally, for developers, multiple architectures and different IP components lead to time-consuming (and thus costly) software development and a fragmented ecosystem. Offering more accessible and performant solutions consisting of IP, software, and tools therefore leads to a more seamless development process and more immersive applications.
In response to these challenges, Arm’s Total Compute solutions offer a full suite of hardware IP (including the latest Armv9 CPUs, Mali GPUs and System IP), physical IP, software, tools, and standards. Together, these help partners build the best SoC for different consumer device markets. Each type of solution offers different levels of performance, efficiency, and scalability. The solutions are divided into three broad categories: Premium, designed for top performance and connected user experiences; Performance, designed to address a wide range of performance and efficiency requirements; and Efficiency, ultra-scalable solutions designed for best-in-class cost efficiency.
Following the launch of the Total Compute solutions, we will be exploring how these different types of solutions can be applied to different consumer device segments in a series of blogs. We are starting with the smartphone segment.
The smartphone form factor has changed little since its introduction to the market in 2007, aside from some innovation through folding phones, yet the features and use cases on devices continue to grow in both number and compute complexity. The smartphone is now front and center in people’s digital lives, offering a range of services that digitally empower users to be more productive, creative, and connected.
Arm powers the world’s smartphones, from high-performance premium to entry. Our scalable IP balances high performance for more complex and compute-intensive workloads with sustained and efficient performance to deliver long battery life for all-day productivity and play. However, performance and energy efficiency requirements differ across the tiers of smartphone devices, often depending on the user experiences they provide. For premium smartphones, users want console-quality gaming, high-resolution screens and superior camera quality. Mainstream smartphone users still want a premium experience from the same apps and services, but at a more competitive price point. Different Total Compute solutions target the premium and mainstream tiers of smartphone, with the IP featured in each solution based on the requirements of that tier.
Arm’s Total Compute solutions offer different configurations for the premium and mainstream smartphone markets. The 1+3+4 CPU configuration of 1x Arm Cortex-X2, 3x Arm Cortex-A710 and 4x Arm Cortex-A510 alongside an Arm Mali-G710 GPU is targeted at premium smartphones, with this solution enabling peak and sustained performance benefits for various use cases. Meanwhile, the 4+4 CPU configuration of Cortex-A710 and Cortex-A510 alongside an Arm Mali-G510 GPU allows our partners in the mainstream smartphone market to target a broad range of smartphone devices at various cost and performance points, while still offering impressive sustained performance for key smartphone use cases. Underpinning the CPUs and GPUs is our System IP (CoreLink Interconnect CI-700 and NI-700), which provides improved energy efficiency and system performance to add further improvements across any Total Compute solution. These interconnects can be targeted at premium, mainstream or even entry-level smartphone markets.
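As a rough illustration, the two smartphone configurations described above could be written down as plain data. This is only a sketch: the `TotalComputeConfig` class and its field and method names are hypothetical and chosen for this example, not part of any Arm tool or API. Only the core counts, GPU names, and interconnect names come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TotalComputeConfig:
    """Hypothetical container for one smartphone solution (illustrative only)."""
    tier: str
    cpu_cores: dict        # core name -> count, listed from biggest to smallest core
    gpu: str
    interconnect: tuple

    def core_count(self) -> int:
        # Total number of CPU cores across all clusters.
        return sum(self.cpu_cores.values())

    def layout(self) -> str:
        # Shorthand such as "1+3+4", built from the per-core counts in order.
        return "+".join(str(n) for n in self.cpu_cores.values())


PREMIUM = TotalComputeConfig(
    tier="premium",
    cpu_cores={"Cortex-X2": 1, "Cortex-A710": 3, "Cortex-A510": 4},
    gpu="Mali-G710",
    interconnect=("CoreLink CI-700", "CoreLink NI-700"),
)

MAINSTREAM = TotalComputeConfig(
    tier="mainstream",
    cpu_cores={"Cortex-A710": 4, "Cortex-A510": 4},
    gpu="Mali-G510",
    interconnect=("CoreLink CI-700", "CoreLink NI-700"),
)

for cfg in (PREMIUM, MAINSTREAM):
    print(cfg.tier, cfg.layout(), cfg.core_count(), cfg.gpu)
```

Both configurations use eight CPU cores in total; what changes between tiers is the mix of core types and the GPU.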
With smartphones now the primary device choice for most consumers of technology, all devices, whether premium or mainstream, require impressive performance, power and energy efficiency. This allows users to better enjoy all of their favorite smartphone applications and experiences for longer. This is what we aim to achieve through our Total Compute solutions for the premium and mainstream smartphone markets.
Premium smartphones are often the flagship device for mobile OEMs. Because of their high performance requirements, these devices need one of the most premium Total Compute solutions, which features the Cortex-X2, Cortex-A710, and Cortex-A510 CPUs and a Mali-G710 GPU. This is supported by our CoreLink CI-700 and NI-700 Interconnect technologies for improved energy efficiency and system performance.
The best CPU approach for Total Compute solutions targeting premium smartphones is a tri-cluster design featuring Cortex-X2, Cortex-A710, and Cortex-A510. In this 1+3+4 configuration, 1x Cortex-X2 delivers peak performance, while 3x Cortex-A710 and 4x Cortex-A510 provide best-in-class efficiency for sustained use cases on premium smartphones, such as AAA gaming.
Cortex-X2, which is part of the Cortex-X Custom Program that focuses on ultimate performance, provides peak performance for a faster user experience on the smartphone device. The energy efficiency improvements of Cortex-A710 enable improved sustained performance and maximize battery life while still fitting within the thermally constrained envelope of premium smartphones. Finally, Cortex-A510 provides further performance uplifts that improve the overall cluster multi-core performance and efficiency for everyday tasks on smartphones. The performance boost means workloads can run longer on the Cortex-A510 before switching to Cortex-X2 or Cortex-A710. This helps to boost the overall efficiency in the CPU cluster, as fewer compute workloads need to run on the bigger cores.
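The point about workloads staying on the smaller cores for longer can be shown with a deliberately simplified toy model. Everything in it is an assumption made for illustration: the tier names, the capacity numbers, and the `place` function are hypothetical, and this is not how a real operating system scheduler decides where to run a task. It only shows that raising the capacity of the smallest tier means fewer workloads spill over to the bigger cores.

```python
def place(demand, capacities):
    """Return the smallest tier whose (hypothetical) capacity covers the demand."""
    for name, cap in sorted(capacities.items(), key=lambda kv: kv[1]):
        if demand <= cap:
            return name
    # Nothing is large enough: fall back to the highest-capacity tier.
    return max(capacities, key=capacities.get)


# Made-up capacity units; only the relative change in "little" matters here.
older_little = {"little": 30, "big": 70, "prime": 100}
faster_little = {"little": 40, "big": 70, "prime": 100}

demands = [10, 25, 35, 60, 90]  # made-up workload demands

before = [place(d, older_little) for d in demands]
after = [place(d, faster_little) for d in demands]

print("before:", before)
print("after: ", after)
print("tasks on little cores:", before.count("little"), "->", after.count("little"))
```

In this sketch, the workload with demand 35 moves from the "big" tier to the "little" tier once the little-core capacity rises, which mirrors the efficiency argument made above.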
The flagship Mali-G710 is Arm’s highest performing GPU ever. It offers a range of new features and technologies designed specifically for gaming experiences, such as AAA gaming, on premium smartphone devices. The command stream frontend (CSF) aligns Mali GPUs to the requirements of modern APIs, such as Vulkan. It also helps to reduce the amount of work that the CPUs have to perform, which lowers the power budget. Meanwhile, Mali-G710’s larger shader cores offer greater performance and energy efficiency to enable longer battery life on premium smartphones for improved gaming ‘on-the-go’. Finally, the redesign of the texture unit enhances complex gaming assets and scenes on mobile.
Just like for premium smartphones, the Total Compute solutions for mainstream smartphones will have Cortex-A710 and Cortex-A510 as part of the CPU configurations. Mainstream smartphones still require premium-like user experiences on the device, particularly for key sustained use cases like gaming. The Cortex-A710, with a 10 percent performance improvement balanced with advances in efficiency, is the right candidate for this market, with the 4+4 CPU configuration still the most popular among current smartphone devices. It is important to note that this configuration still provides impressive performance improvements for the end user, but helps to deliver the device at a lower price point.
Another change is the GPU, with the mainstream smartphone Total Compute solution adopting the mid-range Mali-G510 GPU instead of the premium Mali-G710. Arm’s mid-range Mali GPUs now take on more compute and performance than ever before. Mali-G510 adopts features and enhancements from the premium Mali-G710, which are then optimized for different performance, power and area (PPA) points. However, Mali-G510 also adds further features on top. These include formats for better HDR support, Arm Frame Buffer Compression (AFBC) uncompressed buffers and the new Arm Fixed Rate Compression (AFRC) for bandwidth reductions. The adoption of a range of features and enhancements by Mali-G510 means more advanced user experiences and graphics across a broad range of mainstream smartphone devices.
Our Interconnect technologies are vital components of our new Total Compute solutions and can be used across both premium and mainstream smartphone markets. Both CoreLink CI-700 and CoreLink NI-700 are highly configurable IP designed to enable the very best solution performance, while also providing power and bandwidth reductions across the system to optimize various use cases for premium and mainstream smartphone devices.
Moreover, all of the CPU configurations in our Total Compute solutions are bound together by Arm’s new DSU-110, which is the backbone of any CPU cluster. This enables our partners to address diverse smartphone devices across various PPA points.
AAA mobile gaming on premium and mainstream smartphones is a great example of how Arm’s Total Compute solutions deliver tangible benefits to complex, real-world, and specialized compute workloads. The different solutions provide optimizations across Armv9 CPUs, the latest Mali GPUs, and their software drivers running on the CPU cluster. These include microarchitectural innovations in the CPU, the introduction of new GPU features like CSF to reduce the CPU load, and new feature improvements in the Mali-G710 and Mali-G510 GPUs, such as the redesigned texture unit and execution engines, for more demanding gaming content. Furthermore, the new Interconnect CoreLink CI-700 supports a System Level Cache (SLC), which, with new features introduced in the GPU, reduces latency and system power consumption for various gaming content running in the system.
However, Arm’s Total Compute solutions are not just about IP improvements for mobile gaming. We are investing in the gaming ecosystem and working with leading game engines, companies, and developers to ensure that their gaming content can be optimized for Arm IP. Arm Mobile Studio is a great example of a tooling platform that helps developers optimize their gaming content and unlock further performance and efficiency benefits. It is a suite of free-to-use performance analysis tools that analyze the CPU activity, GPU activity, and content metrics of games. This means game developers can quickly identify and fix any problems that might cause the game to run slowly, overheat the device, or quickly drain battery life.
To learn more about our Total Compute solutions, watch this new video about how they achieve accelerated performance growth across key compute workloads, use cases, and consumer devices.
The next blog in the series explores Total Compute solutions for the laptop market.
Learn more about Total Compute: https://www.youtube.com/watch?v=hGXSmj2_g7g