In the fast-paced technology world we are used to hearing about improvements from one product generation to the next. In the past, we have looked in great detail at the various metrics used to compare GPU performance for graphics and compute use cases. This time we want to celebrate the recent release of the Samsung Galaxy Note 3 and the Samsung Galaxy Note 10.1 (2014), based on the Samsung Exynos 5 Octa (5420) platform with ARM® Mali™-T628, and the fact that “performance takes a huge step forward” when compared to previous devices on the market. But we also want to take this opportunity to highlight energy-efficiency optimisations and double-check whether we have delivered on our promise of a 50% improvement. So in this blog we will look more closely at the best practices for benchmarking performance in battery- and area-constrained devices and for comparing improvements in energy efficiency.
Let’s start with the three golden rules of benchmarking GPUs for energy efficiency.
Let’s imagine two devices with two different form factors. One of them has a 720p screen while the other is equipped with an ultra high definition 2.5K screen. In each frame the latter device has to process roughly four times as many pixels as the former one (see graph below). This means that in order to deliver the same frame rate it has to provide roughly four times the throughput and will potentially consume a proportionally higher amount of energy. That explains why most of the industry standard benchmarks tend to use off-screen buffers with a fixed resolution (for instance 1080p in the case of GLBenchmark) and are therefore able to provide an apples-to-apples comparison for devices of different form factors.
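The pixel arithmetic behind that comparison can be sketched in a few lines. This is only an illustration: the text does not say which exact panel "2.5K" refers to, so the two resolutions below (2560x1440 and 2560x1600) are our assumptions, and the helper names are made up for this example.

```python
# Minimal sketch: how many more pixels per frame a higher-resolution panel has.
# The "2.5K" resolutions are assumptions; the text does not name an exact panel.

def pixels_per_frame(width: int, height: int) -> int:
    """Number of pixels the GPU has to produce for one full-screen frame."""
    return width * height

def pixel_ratio(high: tuple, low: tuple) -> float:
    """How many times more pixels 'high' has compared with 'low'."""
    return pixels_per_frame(*high) / pixels_per_frame(*low)

HD_720P = (1280, 720)
QHD_1440P = (2560, 1440)   # assumed 16:9 "2.5K" panel
WQXGA = (2560, 1600)       # assumed 16:10 "2.5K" panel

print(pixel_ratio(QHD_1440P, HD_720P))        # 4.0
print(round(pixel_ratio(WQXGA, HD_720P), 2))  # 4.44
```

Rendering every device to the same off-screen resolution removes this factor from the comparison altogether.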
To render a single frame of given content, a GPU has to process a specific amount of data and consume a given amount of energy. In the typical use case it will be required to deliver 60 frames per second for any content visible on the screen. Even if the GPU is capable of running faster than 60 fps, the frame rate will be capped by the screen refresh rate of 60 Hz. As we pointed out earlier, typical graphics benchmarks will often use off-screen buffers to compare performance at the same screen resolution - this enables tests to run at a frame rate beyond 60 fps and allows devices to be compared at their top-end performance.
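The effect of the refresh-rate cap can be modelled very simply. The sketch below is an idealised assumption (it ignores dropped frames and buffering behaviour), and `reported_fps` is a hypothetical helper, not part of any benchmark's API.

```python
# Minimal sketch: on-screen rendering is capped by the display refresh rate,
# while off-screen rendering reports what the GPU can actually sustain.
# Idealised model; real swap behaviour is more complicated.

def reported_fps(gpu_fps: float, refresh_hz: float = 60.0,
                 on_screen: bool = True) -> float:
    """Frame rate a benchmark would report under this simplified model."""
    if on_screen:
        return min(gpu_fps, refresh_hz)
    return gpu_fps

# Two hypothetical GPUs that both exceed the refresh rate look identical
# on-screen, and only the off-screen run separates them.
print(reported_fps(75.0), reported_fps(110.0))                    # 60.0 60.0
print(reported_fps(75.0, on_screen=False),
      reported_fps(110.0, on_screen=False))                       # 75.0 110.0
```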
Modern mobile devices enable different use cases with diverse graphics requirements, particularly when it comes to the complexity of the content being processed. Obviously a GPU has to do much less to process a single frame of a user interface or a casual game than it would when running a high-end game or a graphics benchmark designed to stress test the graphics system. Industry standard 3D graphics benchmarks provide a good indication of what we could expect from AAA class content. However, we also have to look into test cases that match more casual use cases, for example playing Fruit Ninja or scrolling through the Android™ UI. In the past we covered why metrics such as triangles per second or pixels per second don’t necessarily map onto a real-life balanced use case, and why it is always important to actually run an application that is representative of the use cases we want to characterise.
Taking into account the above considerations on performance, screen resolution and use case it’s much easier to realise why average power on its own is not a useful metric when comparing how a given GPU performs within a given power budget. As it stands, this metric ignores performance and resolution and without any further context it says nothing about the efficiency of the GPU.
If we had to construct a metric that takes those factors into account we would have to provide an Average Power for a given Performance at a given Screen Resolution running a given Use case. Ignoring the excessive use of upper case in the above sentence, clearly we would need to simplify it a little bit - and we have two choices:
Leaving behind the academic debate on the advantages of joules per task or even joules per pixel over performance per watt, in both of the cases above we are talking about exactly the same GPU and we have just chosen to present it in two different ways.
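To see why the two presentations describe the same thing, note that a watt is a joule per second, so frames per second per watt is simply frames per joule, and joules per frame is its reciprocal. The sketch below uses made-up numbers purely for illustration. They are not measurements of any device, and the function names are ours.

```python
# Minimal sketch: performance per watt and energy per frame are reciprocals.
# The input values are hypothetical, chosen only to make the arithmetic clear.

def performance_per_watt(fps: float, avg_power_w: float) -> float:
    """fps / W, which is the same unit as frames per joule."""
    return fps / avg_power_w

def joules_per_frame(fps: float, avg_power_w: float) -> float:
    """Energy spent on one frame: (J/s) / (frames/s) = J/frame."""
    return avg_power_w / fps

fps, power_w = 32.0, 2.0   # hypothetical: same resolution, same content

print(performance_per_watt(fps, power_w))  # 16.0 frames per joule
print(joules_per_frame(fps, power_w))      # 0.0625 joules per frame
```

Either number is only meaningful when the resolution and the content are fixed, which is exactly why average power on its own tells us so little.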
Second generation of Mali-T600 series delivers 50% energy efficiency improvement… and then some
Up to now this article was far from the promised celebration of the market leading Samsung devices with Mali-T628 and it’s time to change that… So how did we do with the second generation of the Mali-T600 series and how do we compare to other devices on the market? For that we will compare the performance and energy efficiency between a 2012 tablet with the Mali-T604 and a 2013 one with the Mali-T628 MP6.
As you can see, we not only delivered the 50% energy efficiency improvement, but in reality we achieved more than 100%. Obviously the semiconductor fabrication used to create a System on a Chip is constantly improving, and new process nodes introduce higher clock speeds and lower power consumption. So we have to take into account manufacturing process improvements as well as the benefits of using ARM Artisan™ Physical IP, which enables efficient implementation of such complex SoC designs. But even with that we can safely assume that we have delivered our promised improvement in the Mali-T620 series.
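For readers who want to apply the same comparison to their own measurements, a generation-over-generation improvement can be computed as the relative change in performance per watt. The numbers below are placeholders, not the measured results for the two tablets, and `efficiency_gain_percent` is a hypothetical helper name.

```python
# Minimal sketch: relative energy-efficiency improvement between two devices
# measured on the same content at the same resolution.
# All input numbers are placeholders, not measurements.

def efficiency_gain_percent(old_fps: float, old_power_w: float,
                            new_fps: float, new_power_w: float) -> float:
    """Percentage improvement in fps per watt from the old to the new device."""
    old_eff = old_fps / old_power_w
    new_eff = new_fps / new_power_w
    return (new_eff / old_eff - 1.0) * 100.0

# Placeholder example: 2.5x the frame rate for 1.25x the power
# means efficiency doubles, which is a 100% improvement.
print(efficiency_gain_percent(16.0, 2.0, 40.0, 2.5))  # 100.0
```

Under this definition a 50% improvement means 1.5x the frames per joule, and anything above 100% means efficiency has more than doubled.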
It is also worth bearing in mind that GPU energy efficiency is only half of the story: ARM big.LITTLE™ multi-processing technology and the ARM Cortex™-A series of processors deliver high performance and efficiency across the entire system. Memory bandwidth also contributes to the power consumed by the entire system. Mali GPUs with Job Manager, Hierarchical Tiler, Transaction Elimination, Adaptive Scalable Texture Compression and other upcoming features enable multiple additional end-to-end savings that result in ARM-based systems with longer battery life and a lower thermal budget.
So to summarise, the Mali-T628 GPU has delivered even more than we promised, and in the next few blogs we are going to explain how ARM technology leadership and Mali GPUs further reduce system power.
I shouldn't be surprised by now, but with each generation it is still amazing to me to see the graphics improvements. When I see the same video, game or app play on a device from a generation before (which in devices is often less than a year), the improvements make my eyes dance.
And exciting new products keep coming out on the Mali-T604 devices. Another Samsung Exynos-based device, the HP Chromebook 11, was reviewed by AnandTech (HP Chromebook 11 Review).
So I continue to be delighted by the innovative products that are based on Mali.
Thanks for the blog Jakub.