For some time now I’ve been introducing myself in meetings as being responsible for future technology in graphics, video and display, and finally I have something to talk about publicly. As my colleague Chris Porthouse says in his blog “Is the future as good as it used to be?”, ARM has just proudly launched the ARM® Mali™-DP500 Display Processor, available for licensing now, after a multi-year development.
In the bad old days, when pretty much all you had to do was read in the frame buffer from DRAM and push it out in pixel order, the job of designing a “display controller” was usually given to a junior SoC engineer to cut their teeth on. The bandwidth to DRAM was OK, screen resolutions and frame rates were low, the contention on the bus was manageable, security wasn’t a concern and image processing was limited to trying to get the colours of the pixels correct. Inevitably a quick respin of the FPGA was needed when someone’s interpretation of RGB turned out to be BGR when laid out in memory, but usually no-one was fired. The software was often no more complex than a short driver that programmed a few registers, and high-level operating systems with complex GUIs were mostly unknown or irrelevant to the world of battery-powered devices.
These days, driving a 1080p screen at 60 frames per second means you are probably reading an absolute minimum of 250 Mbyte/s from DRAM to display and it could easily be a lot more. There are many other masters on the bus in a modern SoC and you can’t just rely on being able to get the bandwidth and bus priority you might want. What was originally a simple IP block has now become a relatively high-bandwidth bus master that needs to play nicely with the other masters. What was a simple controller has become a processor in its own right. On a modern smartphone, multiple image layers from a still camera, decoded digital video, 3-D graphics and other sources need to be blended and composited to produce the final screen image, and layers in different image formats may need to be blended together, with formats converted on the fly. Even phones might have multiple screens, supporting a second over HDMI or Miracast or other WiFi-based display. Those screens might display completely different images, or they might display the same image, but differently scaled. In addition, tablets and other devices are scaling to 4K screens, users are expecting the screen image to rotate when they rotate their devices, and secure display is a concern for many when dealing with financially sensitive data. Add all that to the OS integration work that Chris talked about, and are you still happy to have the new graduate do it all? No, we didn’t think so, and that’s one of the reasons why we produced the ARM Mali-DP500 product. Designing and validating a display processor (with all the associated software work) that meets modern requirements is a significant piece of work, and it makes economic and schedule sense to buy in the IP rather than do all that work in-house.
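For anyone who wants to check the 250 Mbyte/s figure, here is a minimal back-of-the-envelope sketch. It assumes a single uncompressed layer read once per frame, and the function name and the bytes-per-pixel choices (2 for a 16-bit format such as RGB565, 4 for 32-bit RGBA) are my own illustrative assumptions rather than anything specific to the Mali-DP500. Real systems will differ once you add blanking, multiple layers, scaling and compression.

```python
def scanout_bandwidth(width, height, fps, bytes_per_pixel, layers=1):
    """Approximate bytes per second read from DRAM to refresh a display.

    Assumes every layer is full-screen, uncompressed, and read exactly
    once per frame (a simplification for illustration only).
    """
    return width * height * fps * bytes_per_pixel * layers


MBYTE = 1_000_000  # decimal megabytes

# 1080p at 60 frames per second, one layer
fhd_16bpp = scanout_bandwidth(1920, 1080, 60, 2)
fhd_32bpp = scanout_bandwidth(1920, 1080, 60, 4)

# 4K (3840x2160) at 60 frames per second, one 32-bit layer
uhd_32bpp = scanout_bandwidth(3840, 2160, 60, 4)

for label, value in [
    ("1080p60, 16 bpp", fhd_16bpp),
    ("1080p60, 32 bpp", fhd_32bpp),
    ("2160p60, 32 bpp", uhd_32bpp),
]:
    print(f"{label}: {value / MBYTE:.1f} Mbyte/s")
```

Under these assumptions, the 16-bit case comes out at just under 250 Mbyte/s, which matches the "absolute minimum" above. A 32-bit frame buffer doubles that, and every extra full-screen layer you composite multiplies it again, which is why "a lot more" is easy to reach.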
The other major reason for ARM deciding to produce a display processor product is technical. When you have multiple pieces of media processing IP that are designed to work together, you can take advantage of that to improve efficiency and save power. Just like ASTC (blogged extensively), which is a texture compression method used in GPUs to save memory bandwidth, we also have ARM Frame Buffer Compression (AFBC), used across multiple pieces of IP. AFBC is a lossless compression method invented by ARM, also used to compress/decompress images to save memory bandwidth (and thus SoC power), but without losing any detail or quality whatsoever. When we launched the ARM Mali-V500 video processor, Ola Hugosson blogged about how we use AFBC internally in Mali-V500 for reference frames, reducing bandwidth, and also to write out the final image in compressed form. In Ola’s blog there’s a graph of how much memory bandwidth can be saved. To gain maximum advantage, you need a display processor that can read the compressed images and decompress them, in addition to all the other features described above. With an AFBC-enabled display processor, you can save hundreds of Mbyte/s when decoding a 4K stream, and even with modern memory systems that adds up to a significant power saving. The graph is repeated here and the difference between the green line and the red line is the use of an AFBC-enabled display processor.
Partners want all IP blocks to use a common, lossless compression format so that data can be interchanged seamlessly between them in the most power- and memory-efficient way, so obviously the new Mali-T760 GPU also supports AFBC. We have joined-up product families of CPUs, GPUs, video processors, and now display processors that can utilise common technologies across the system to save power. TrustZone™ is another technology we use across the processor families to create security and content protection solutions and we have other technologies being worked on at the moment which will increase the advantages our partners gain if they take multiple IP blocks from us, but I have gone on long enough for now…
I’ve blogged before about the way in which we have joined-up technical strategies across our product lines. Funnily enough it’s something I spend a lot of my days doing. For example, at the time of its launch, many people asked why the Mali-T600 family of GPUs was able to use more than 32-bits-worth of memory. Then ARM produced CPUs that can access it as well and they started to get it. Now of course 64-bit addressing has suddenly become the new black and we’re able to demonstrate systems with 64-bit addressing being used across the system. We’re in a much more advanced position on this than many of our competitors. Joining ARM’s IP together is much, much more than simply defining AMBA bus interconnect standards (great though that is). In this blog, I hope I’ve given you a flavour of how we work on driving advantages and optimisations from that joining-up, why the ARM Mali-DP500 will be a fantastic component of our partners’ systems and why that will help our partners make better products.
Ian, you're absolutely right. There are other advantages of having a technically joined-up solution, and security is a very important one. As we scale to very high resolution displays and higher image quality (e.g. greater bit depth), the studios are increasingly looking for their high value content to be protected from being stolen. TrustZone technologies are an increasingly important element of protection strategies and we're working hard in the SCSA to drive standards in this area.
Jem, as you note, one of the areas that the DP500 delivers is an end-to-end, secure path for content delivery. The studios are very interested in delivering premium content to devices around the same time it is arriving in movie theaters (I guess I should say "cinema" but I have been in the US too long). Ignoring the (non-trivial) challenge of delivering extremely large files across the network (some very interesting ideas in this area), it must simply be impossible to access and copy the content. For companies that can work out how to do that, I think there is great revenue opportunity ahead. ARM was announced in December as a new contributing member to the Secure Content Storage Association (SCSA).