Today we announced the ARM® Mali™-T604 GPU, the first implementation of ARM’s new Midgard architecture. The increase in screen resolutions and the demand for better-looking and more intuitive displays need a huge increase in graphics capability. These demands for the highest levels of performance and flexibility, and for support for new APIs such as Khronos™ OpenCL™ and Microsoft® DirectX®, all in an energy-efficient way, called for a new embedded GPU architecture...

Wait. That was a bit dull. It didn’t have the why, the how, or my excitement! Let me try again:

What

At last, it’s here. We’ve been hinting, and I’ve been bursting to tell you all about our new baby for a while now, but today is the day that we finally announce to the world our new graphics processor (GPU), originally code-named Vithar, here at the ARM Technology Conference. Finally the ARM Cortex™-A15 has a companion to play with… This is the culmination of so much hard work, and I’m so proud, but first I want to give you some brief history…

The Background

Over 5 years ago, my boss asked me to go buy a graphics company to kick-start our entry into the graphics market, which was clearly ready for ARM-quality IP. We looked around and closely investigated all possibilities. The team and I eliminated all the others and settled on Falanx, a start-up company in Trondheim, Norway, with great technology and superb engineers. You can see more of this story in a blog to be posted by Edvard Sørgård, one of the founders, who is still with us at ARM, designing GPUs. Even then, they had ideas on the drawing-board for a new graphics architecture, which came to be called the Midgard architecture (“Midgard” is the realm of men in Norse mythology, connected to the realm of the gods by a bridge).

Since then, we’ve invested significantly, increased the size of the team hugely and set up design centers working on graphics in Trondheim, the UK, Lund in Sweden, San Jose in the US and Shanghai in China.
First we made the Mali-200 and Mali-400MP, the world’s first embedded multi-core GPU architecture; then we took the best of the graphics design ideas, added some of ARM’s CPU and cache/bus expertise into the mix and produced the world’s best embedded GPU, the ARM Mali-T604. Well, that’s my opinion, and I will explain why.

Performance, performance, performance. (Why do I need a new GPU?)

Today’s screen resolutions span from very small mobile phone displays through to 1080p DTVs and beyond in the home. The quest for better-looking, more informative, more intuitive displays means we are seeing huge increases in typical display resolutions, and the graphics performance required to meet those demands is proportional to the number of pixels on the display multiplied by the desired frame rate. At the same time, we are seeing around a 10-fold increase in the complexity of processing done per pixel in modern content, including games. All this amounts to a greater than 50-fold increase in the graphics capability needed for next-generation products, and the Mali-T604 was designed to address those needs.

Did I say performance? I meant scalable performance

In addition to a demand for increased performance, we are seeing a demand for scalable performance, both across different designs and within a particular SoC. Partners are asking for a GPU that can be scaled for different designs: feature phones, smartphones, mobile computing devices, digital home and automotive infotainment systems. Partners are licensing Mali-T604 for use in numerous segments, using different numbers of cores in their multicore designs.

Energy-efficient performance

Partners also require dynamic control over both performance and energy use. The multicore nature of the Mali-T604 design lets SoC designers power off cores that are not in use, enabling them to tune their energy use to a minimum.
This flexibility of fine-grained control is proving to be very popular and will keep ARM at the forefront of energy-efficient visual computing.

It’s system performance, stupid (and system energy-efficiency)

ARM makes great IP components, but that’s not enough. Mali-T604 is even better when used with the CoreLink™ CCI-400 Cache Coherent Interconnect and the Cortex-A15 processor. It was designed to be coherent with the CPU’s caches, and this ability to snoop into those caches reduces both external memory bandwidth and the load on the CPU.

In addition to this, Mali-T604 improves on Mali-400, which was already the world’s lowest memory bandwidth embedded GPU, and reduces that bandwidth even further through advanced (and patented) technology such as improved caching, hierarchical tiling, and transaction elimination of writes to the framebuffer.

The energy used to transfer data to DRAM is often as significant in the SoC as the energy used in the GPU itself. Reducing external memory bandwidth saves overall SoC power. Also, in modern battery-powered SoCs, the memory bandwidth limits are often the overall limits to real, achieved performance. It’s all aimed at making the real-world, delivered performance of Mali-powered SoCs the best in the world.

Greater flexibility and new APIs

Those of you who have been following my recent blogs will not be at all surprised to hear that Mali-T604 is the first graphics processor from ARM supporting GPU computing (GPGPU). The Midgard architecture was designed from the start to have extra flexibility for the new APIs, and the Mali-T604 product includes an implementation of OpenCL v1.1 (full profile) that supports both ARMv7/NEON CPUs and the Mali-T604 GPU, as you’d expect from a company that sells CPUs and GPUs. This realises the potential for maximising the control and use of the resources between the CPU and GPU in a system, and is a capability that is central to GPU computing using OpenCL.
You can see more details of our unique joined-up OpenCL product in my colleague Rob Elliott’s presentation at TechCon.

In previous blogs I’ve discussed the areas in which I think GPU computing will take off in the embedded world in general. As systems become increasingly complex, the ability to use all of the resources in a device becomes more important, particularly while minimizing power consumption for a mobile device. Being able to share the workload of some tasks between the CPU and GPU within a system is a feature unique to Mali-T604 amongst embedded GPUs and will, I believe, enable additional resources to be used for performance-intensive applications such as image processing and augmented reality. I look forward excitedly to seeing what cool stuff designers will do with this capability.

How do they do that?

The demands for the highest levels of performance and the flexibility to support new APIs such as OpenCL and Microsoft DirectX called for a new architecture. The Midgard architecture is ARM’s new architecture for the next generations of our Mali GPU family. Mali-T604 is the first implementation of the Midgard architecture, which is designed to address the demands of the evolving world of graphics and prepared to meet the challenges of using GPUs to solve other types of computational problems.
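
As a rough check on the 50-fold figure from the performance section, the sketch below applies the stated rule (pixels on the display multiplied by frame rate, then a 10-fold increase in per-pixel work) to one example. The WVGA baseline, the unchanged 60 frames per second, and the names `pixel_rate` and `capability_ratio` are assumptions made for illustration; the post does not say which displays the figure was based on.

```python
# Back-of-the-envelope check of the scaling argument in the performance section.
# ASSUMPTIONS: the WVGA baseline, the 60 fps frame rate and all names below are
# illustrative choices, not figures or terminology from ARM.

def pixel_rate(width, height, fps):
    """Pixels to be produced per second: pixels on the display x frame rate."""
    return width * height * fps


def capability_ratio(baseline, target, per_pixel_complexity=10):
    """Relative graphics capability needed when moving from baseline to target.

    Follows the post's rule: required performance is proportional to the pixel
    rate, multiplied here by the quoted 10-fold growth in per-pixel processing.
    """
    return per_pixel_complexity * pixel_rate(*target) / pixel_rate(*baseline)


wvga = (800, 480, 60)        # assumed phone-class baseline (hypothetical)
full_hd = (1920, 1080, 60)   # 1080p, the resolution named in the post

ratio = capability_ratio(wvga, full_hd)
print(f"{ratio:.0f}x")
```

Under these assumptions the script prints 54x: the 1080p display has 5.4 times as many pixels as the WVGA one, and the 10-fold per-pixel factor does the rest. That is in line with the "greater than 50-fold" claim, though a different baseline or frame rate would give a different multiple.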

"Tri-Pipe" Architecture

Our new shader core is the heart of the new GPU and it’s really cool. Based on a radical “tri-pipe” architecture using three different types of execution pipeline within the shader core, it simultaneously addresses the demands of evolving high-performance graphics and GPU computing without compromising graphics performance or efficiency. The tri-pipe architecture delivers higher levels of performance by issuing instructions in parallel to the pipelines that handle the three main parts of graphics and GPU computing: arithmetic, texturing, and load/store/varyings.

The arithmetic pipeline supports full IEEE 754-2008 and has a wide range of data types, from FP16 through FP32 to double-precision FP64, and all the integer types. OpenCL efficiency and performance are assured with a large number of the Built-In Function Library routines supported directly as instructions. The texture pipeline supports all the new texture formats needed for the new APIs, and both it and the load/store/varyings pipeline have new features to reduce energy consumption and increase throughput in real-world memory systems.

I could go on about our wonderful new GPU for ever, but I think that’s enough for now. Tell me what you think... If you want a bit more, here’s a short video I made…

Follow @ARMMultimedia and #ARMMali on Twitter for Mali-T604 news and updates. Also, tell us what you are looking forward to most with the Mali-T604 technology on the ARM Facebook fan page.

Interview at ARM Techcon 2010

Jem is an ARM Fellow and likes to think of himself as "The Godfather" to technical talent in ARM. After spending some time in his youth writing software for satellites and traffic-lights among other fascinating things, Jem spotted the technical inflection point of the mobile industry: graphics, video and other visual processing. As VP of technology in the Media Processing Division of ARM, Jem is busy with a lot of projects involving the future of cool ARM technology, which will revolutionise how people experience and interact with digital devices.