Many mobile phone users have experienced the disappointment of having their favourite device run out of juice when they are far away from a power socket. Some of them may also have experienced the warm but uneasy feeling of their device overheating while they are hooked on an astonishingly addictive new 3D game. In the engineering world, these problems of battery life and thermal constraints are known to be two facets of the most significant challenge for the mobile computing industry: designing high-performing processors that can meet the increasing demand for rich visual experiences on a limited power budget.
This week at TechCon we are excited to introduce a new generation of ARM® Mali™ GPUs. The ARM Mali-T760 GPU and the ARM Mali-T720 GPU extend the previously accepted boundaries of energy efficiency and have been optimized to address the divergent requirements of the high performance and low cost market segments respectively.
The mobile computing industry has been booming over the last few years, with smartphone volumes increasing significantly and all indications pointing towards this trend accelerating in the future. Smartphone shipments are expected to grow fivefold from 2010 to 2015. Emerging markets, some with very large populations, are key contributors to this growth.
However, due to financial restrictions on a large part of the population in these countries, the entry-level sector benefits the most and will account for almost half of overall smartphone volumes. These entry-level devices need to provide a similar user experience to their higher-end counterparts, but with manufacturing costs well below $150 to ensure a viable profit margin. To achieve that, manufacturers optimize costs in every step of the manufacturing process. Time to market is also critical for this segment because prices drop dramatically when a competitor brings a similar solution to the market, squeezing the already low profit margins.
At the other end of the smartphone spectrum, superphone development is driven by the demand for ever higher performance within a mobile power budget that stays static. High-end tablets have already broken the barrier of full HD 1080p screen resolution and are taking the pixel density race to a whole new level, setting 4K2K (UHD) resolution as the new ultimate target. At the same time, High Dynamic Range applications demand 10 to 12 bits of precision per colour plane, contributing to the ever increasing amount of data that a graphics processor needs to handle. New memory technologies like LPDDR4 have been deployed to sustain this increasing need for higher bandwidth. However, the power consumed at higher bandwidths has not been an easy problem to resolve, and it becomes the limiting factor for high-end mobile devices.
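To give a feel for how resolution and colour depth compound, the following back-of-the-envelope sketch computes the raw, uncompressed bandwidth needed to write one frame per refresh. The 60 Hz refresh rate, the three colour planes per pixel and the 3840x2160 dimensions for UHD are assumptions made for illustration, not figures from ARM, and the function name is hypothetical. Real memory traffic also includes texture reads and intermediate buffers, so treat these numbers as a lower bound on a single write pass.

```python
def raw_frame_bandwidth_mb_s(width, height, bits_per_plane, planes=3, fps=60):
    """Raw bandwidth in MB/s (10**6 bytes) to write one uncompressed frame per refresh.

    planes=3 and fps=60 are illustrative assumptions, not ARM figures.
    """
    bits_per_frame = width * height * bits_per_plane * planes
    return bits_per_frame * fps / 8 / 1e6


full_hd_8bit = raw_frame_bandwidth_mb_s(1920, 1080, 8)
uhd_10bit = raw_frame_bandwidth_mb_s(3840, 2160, 10)
uhd_12bit = raw_frame_bandwidth_mb_s(3840, 2160, 12)

print(f"1080p,  8 bits per plane: {full_hd_8bit:.1f} MB/s")  # 373.2 MB/s
print(f"UHD,   10 bits per plane: {uhd_10bit:.1f} MB/s")     # 1866.2 MB/s
print(f"UHD,   12 bits per plane: {uhd_12bit:.1f} MB/s")     # 2239.5 MB/s
```

Under these assumptions, moving from 8-bit 1080p to 10-bit UHD multiplies the raw frame traffic by five (four times the pixels, 1.25 times the bits), and 12-bit UHD multiplies it by six, which is why bandwidth-saving features matter so much at the high end.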
The new ARM Mali GPUs address the requirements of these two different market segments by introducing a variety of new features that redefine what is technically possible.
The ARM Mali-T760 GPU is designed with a focus on high performance at the same time as high energy efficiency. It reaches a 400% improvement in these metrics over previous generations of ARM Mali GPUs.
It supports the latest graphics and GPU Compute application programming interfaces (APIs), such as Direct3D® 11.1 feature level 11, OpenGL® ES 3.0* and OpenCL™ 1.2, ensuring compatibility with the latest and greatest graphics and compute content.
A significant achievement of the ARM Mali-T760 GPU is the new L2 cache interconnect, which provides a cache-coherent view of every L2 cache instance to every shader core and makes sure that memory bandwidth is evenly distributed among them. It scales up to 16 shader cores with linear performance improvement, which allows the highest levels of performance without compromising on area efficiency. From a physical implementation perspective, it reduces the wire count between the L2 cache and the shader cores and so enables easy timing closure, high layout utilization and low pin congestion.
Smart Composition is a new technology introduced for the first time in the ARM Mali-T760 GPU. It has been developed to reduce the bandwidth consumed by reading in textures during frame composition, and can reduce standard Android™ UI texture read bandwidth by more than 50%. By analyzing frames prior to final frame composition, Smart Composition determines whether a given part of the frame needs to be rendered or whether the previously rendered and composited part can be reused. If that portion of the frame can be reused, it is not read from memory again or recomposited, which saves computational effort as well as bandwidth. In addition, the ARM Mali-T760 GPU supports three further bandwidth-saving features: ARM Frame Buffer Compression (AFBC), a unique lossless compression capability; Transaction Elimination; and Adaptive Scalable Texture Compression (ASTC).
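The Smart Composition reuse decision described above can be pictured with a small software sketch. This is not ARM's hardware implementation, whose details are not given here. It assumes, purely for illustration, that the frame is split into tiles and that a CRC32 signature of each tile's input data is compared with the signature from the previous frame. The names `TileCompositor` and `compose`, and the choice of `zlib.crc32` as the signature, are hypothetical.

```python
import zlib


class TileCompositor:
    """Toy model: recompose a tile only if its input data changed since the last frame."""

    def __init__(self):
        self._signatures = {}  # tile index -> CRC32 of the inputs last composed
        self._output = {}      # tile index -> previously composited result

    def compose(self, tiles):
        """tiles: list of bytes objects, one per tile, holding that tile's input data.

        Returns (composited output per tile, number of tiles actually recomposed).
        """
        recomposed = 0
        for index, data in enumerate(tiles):
            signature = zlib.crc32(data)
            if self._signatures.get(index) == signature:
                continue  # inputs unchanged: keep the previously composited tile
            # Stand-in for the real composition work (assumption: a plain copy).
            self._output[index] = bytes(data)
            self._signatures[index] = signature
            recomposed += 1
        return [self._output[i] for i in range(len(tiles))], recomposed


compositor = TileCompositor()
frame_1 = [b"status bar", b"wallpaper", b"clock 10:41"]
frame_2 = [b"status bar", b"wallpaper", b"clock 10:42"]

_, first = compositor.compose(frame_1)   # every tile is new
_, second = compositor.compose(frame_2)  # only the clock tile changed
print(first, second)  # prints "3 1"
```

Note that this toy has to read each tile in order to compute its signature, so it models only the reuse decision, not the bandwidth saving itself. A mostly static UI, where only a clock or a notification icon changes between frames, is the kind of workload where such a scheme skips most of the work.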
Other features of the ARM Mali-T760 GPU include YCrCb frame buffer output and hardware assisted global illumination. Both are designed to increase fidelity and balance memory bandwidth to the system.
The ARM Mali-T720 GPU is designed for performance density and ease of implementation in order to address the cost and time to market challenges of the entry-level smartphone segment. It achieves more than a 150% improvement in energy efficiency and graphics performance over previous generations of cost-optimized ARM Mali GPU solutions.
The ARM Mali-T720 GPU is based on the Midgard architecture, which enables it to benefit from the latest API support, plus bandwidth optimization features such as ASTC textures and Transaction Elimination. However, it has gone through extensive micro-architecture modifications that boost its performance efficiency close to the unparalleled density that can be achieved by the previous generation Utgard architecture. More specifically, it brings industry-leading OpenGL ES 3.0 support* to the mid-range mobile segment and is tuned to provide excellent graphics and compute support for the Android operating system, including support for RenderScript and FilterScript. Additionally, if Linux is the chosen operating system then OpenCL can be used alongside OpenGL ES 3.0.
To dramatically reduce implementation effort and enable fast time to market, the ARM Mali-T720 GPU has been optimized for a reduced number of routing layers while increasing layout utilization, which eases layout and timing closure. In addition, ARM POP™ IP and hard macro implementations will be available from ARM’s Processor IP Division to guarantee best-in-class power, performance and area results with minimum implementation effort.
Does all this sound exciting? Let me know what you think of our new GPUs in the comments section below.
*Product is based on a published Khronos Specification, and is expected to pass the Khronos Conformance Testing Process. Current conformance status can be found at www.khronos.org/conformance