Mali-D71 and the Next Generation Display Solution

You might remember that a few months ago we treated you to a special preview teasing our next generation display processor, then codenamed ‘Cetus’. We talked about exactly what a display processor adds to the overall graphics pipeline and to the Mali Multimedia Suite of Graphics, Video and Display, but mostly about the huge leaps our technology has taken in this field. There was a whole heap of interest back then, so try to contain yourselves as we announce today that Cetus has officially been launched as the Mali-D71 display processor! We can now reveal that the brand-new architecture on which it is built is named ‘Komeda’, and that it provides the framework for incredible display technologies supporting the very latest and most complex use cases.

We talked a lot last time about the in-depth technical innovations and architectural changes behind the massive gains over the previous generation. This time around, we’re going to look at the specific performance improvements achieved, and what they enable for the end user.

So what’s new?

Mali-D71 Performance Points

First up, Mali-D71 limits the workload needing to be handled by the GPU by performing composition, in-line rotation, high quality scaling, gamma/de-gamma and other advanced imaging tasks in fixed function hardware. It does this in the final stage of the multimedia pipeline, before sending the final output to the screen, meaning the GPU never has to be involved at all. In addition, these operations are all performed with a single pass through memory as opposed to multiple passes back and forth, resulting in significant system power savings. For a more specific example, downscaling a 4K video layer for a 1440p device and composing it with a complex, immersive UI graphics layer results in a total SoC power saving of 30% compared to running the same operations in software on the GPU.
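To get a rough feel for why a single pass matters, here is a back-of-the-envelope sketch of memory traffic per frame for the example above. It is an illustration only: the pixel formats (NV12 video at 1.5 bytes per pixel, RGBA UI and output at 4 bytes per pixel), the lack of any AFBC compression and the simple two-path model are all assumptions, and memory traffic is not the same thing as the SoC power figure quoted above.

```python
# Rough per-frame memory traffic for the "4K video + UI on a 1440p panel" example.
# All formats and the traffic model below are assumptions for illustration.

def frame_bytes(width, height, bytes_per_pixel):
    """Size of one uncompressed frame in bytes."""
    return width * height * bytes_per_pixel

video_4k = frame_bytes(3840, 2160, 1.5)   # assumed NV12 (YUV 4:2:0, 8-bit)
ui_1440p = frame_bytes(2560, 1440, 4)     # assumed RGBA8888
output_1440p = frame_bytes(2560, 1440, 4) # assumed RGBA8888 composed frame

# Display processor path: read each source layer once, then scale,
# compose and send straight to the panel with no intermediate write.
single_pass_traffic = video_4k + ui_1440p

# GPU path (simplified): read the sources, write the composed frame back
# to memory, then the display controller reads that frame out again.
gpu_path_traffic = video_4k + ui_1440p + output_1440p + output_1440p

print(f"single pass: {single_pass_traffic / 1e6:.1f} MB per frame")
print(f"GPU path:    {gpu_path_traffic / 1e6:.1f} MB per frame")
print(f"ratio:       {gpu_path_traffic / single_pass_traffic:.2f}x")
```

Under these assumptions the GPU path moves roughly twice as many bytes per frame, before counting any of the GPU’s own work.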

Mali-D71 achieves twice the performance in the same area as its predecessor when operating in side-by-side mode. Unlike its predecessor, when Mali-D71 drives a single display it can reuse the resources of the secondary display. This doubles the number of full frame layers it can composite, rotate and scale without adding to the overall area. Within the same silicon area as the previous generation product, Mali-D71 also offers new enhancements such as scaling split operation, AFBC encoding of uncompressed layers, faster AFBC decoding and MMU optimizations. When the Mali-D71 display processor is implemented with the CoreLink MMU-600, launched in parallel, the integrated Translation Buffer Unit (TBU) and the tight coupling of the two via the DTI interface dramatically reduce MMU latency.

Which brings us to the next point: 4x average latency tolerance. Mali-D71 can sustain up to 4x the delay on the system bus for the same display performance compared to its predecessor, Mali-DP650. Mali-D71 adds significant optimizations to the memory subsystem. It doubles the outstanding transaction capability, removes uncompressed rotation from the real-time path and converts uncompressed linear layers into AFBC1.2 tiled ones for more efficient rotation. This is hugely important in high performance display processing, where 4K frames must be fed to the output at 60-120 fps. For that to happen, the display processor needs to make the best use of the time it has on the system bus by prefetching pixels in the microseconds when the display is blank, so that its buffers are always full of content. If the display does not receive pixels in time, it is starved of content and drops frames, resulting in glitches or visible artefacts on the screen and compromising the visual quality.
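One way to see why outstanding transactions matter is Little’s law: the bytes in flight on the bus must cover bandwidth multiplied by latency. The sketch below applies it with made-up numbers. The burst size, transaction counts and 4 bytes per pixel are assumptions, not Mali-D71 parameters, and this models only one of the optimizations listed above, so it does not reproduce the full 4x figure.

```python
# Little's law sketch: how much bus latency a given number of outstanding
# read transactions can hide at a fixed bandwidth.
# Burst size and transaction counts are hypothetical, not Mali-D71 values.

def tolerable_latency_ns(outstanding, burst_bytes, bandwidth_bytes_per_s):
    """Latency (ns) at which the in-flight bytes just sustain the bandwidth."""
    in_flight_bytes = outstanding * burst_bytes
    return in_flight_bytes / bandwidth_bytes_per_s * 1e9

# One uncompressed 4K layer at 60 fps, assuming 4 bytes per pixel.
bandwidth_4k60 = 3840 * 2160 * 60 * 4

base = tolerable_latency_ns(16, 64, bandwidth_4k60)     # hypothetical baseline
doubled = tolerable_latency_ns(32, 64, bandwidth_4k60)  # twice the transactions

print(f"16 outstanding: {base:.0f} ns")
print(f"32 outstanding: {doubled:.0f} ns")
```

In this simple model, doubling the outstanding transactions doubles the latency that can be tolerated at the same bandwidth.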

Finally, Mali-D71 doubles the pixel throughput in order to achieve premium VR 4K120 performance. It does so when driving a single display in the new side-by-side operating mode that we mentioned earlier. Side-by-side mode splits the image in half, efficiently using both sets of resources to process half the image each, whilst only powering one display output. For 4K60 workloads and below, side-by-side mode can be used to halve the clock frequency, enabling lower voltages and saving power. For 4K120 workloads, side-by-side is mandated - in essence doubling the pixel throughput for the same target frequency. Without side-by-side mode it’s only possible to reach 4K60, so the ability to halve the frame processing by performing it in parallel means you can halve the power, or double the performance.
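The trade-off between clock frequency and throughput can be sketched with some simple arithmetic. This assumes each pipeline processes one pixel per clock and counts active pixels only (no blanking); both are simplifying assumptions for illustration rather than published Mali-D71 figures.

```python
# Required pipeline clock for a given resolution and frame rate, with the
# frame split evenly across one or two pipelines (side-by-side mode).
# Assumes 1 pixel per clock per pipeline and ignores blanking intervals.

def required_clock_hz(width, height, fps, pipelines, pixels_per_clock=1):
    """Clock each pipeline needs to keep up with the active pixel rate."""
    pixel_rate = width * height * fps
    return pixel_rate / (pipelines * pixels_per_clock)

clock_4k60_single = required_clock_hz(3840, 2160, 60, pipelines=1)
clock_4k60_sbs = required_clock_hz(3840, 2160, 60, pipelines=2)
clock_4k120_sbs = required_clock_hz(3840, 2160, 120, pipelines=2)

print(f"4K60, single pipeline: {clock_4k60_single / 1e6:.1f} MHz")
print(f"4K60, side-by-side:    {clock_4k60_sbs / 1e6:.1f} MHz")
print(f"4K120, side-by-side:   {clock_4k120_sbs / 1e6:.1f} MHz")
```

Under these assumptions, 4K120 in side-by-side mode needs the same clock as 4K60 on a single pipeline, while 4K60 in side-by-side mode needs half of it.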

So what’s the point?

Well, if you’ve followed any of our previous releases you’ll know that, as we don’t actually make anything physical ourselves, it can take a little time for our products to hit silicon, let alone an actual device. This means we always have to be (at least) one step ahead of the trends in order to plan for them in our products, and we are always watching for the latest tech disrupting the status quo. There are a few specific trends emerging in the display industry that have a big impact on the performance points and features we needed to target with our new Komeda display architecture, and with the overall solution we’ve put together too. A colleague of mine outlined these trends in a whole lot more detail recently, and you can check that out here, so I’m just going to talk about how our new Display Solution addresses these challenges.

  1. High Dynamic Range

The first emerging must-have in display is High Dynamic Range (HDR). HDR content has been coded across a broader dynamic range to incorporate greater subtlety of colour and contrast. Essentially this makes the dark areas of an image darker and richer, and the lighter areas crisper, cleaner and better saturated, avoiding that washed out appearance you sometimes see in images with bright sunlight, for example. More and more content creators are using HDR to provide the best possible viewing experience, but this is a waste of time if you can’t display it properly. Mali-D71 works with Assertive Display 5 to take HDR content from all your favourite providers, like Netflix and Amazon Video, and display it in full HDR quality on any type of panel, even if it’s SDR. Mali-D71 itself takes the HDR video and the graphical UI overlay and blends them together into one single frame coded in standard gamma with full colour gamut, then sends it to Assertive Display 5 to convert it into the correct colour range for an SDR display.

Without this, all the hard work the content creators put into designing their artwork in HDR10 format would be wasted when viewed through a conventional display processor. With Mali-D71 you can recreate the same awesome HDR quality even on much lower specification displays, retaining the artistic intention of the content.

SDR HDR composition

  2. VR

As we’ve discussed many times before, mobile VR presents a technical challenge, to put it politely. Meeting the real-time latency and throughput requirements, not to mention the pixel quality required when the pixels are right in front of your face, pushes the display processor, and the rest of the system, to their very limits. This is where the latency tolerance we spoke about earlier comes in, as well as the power saving and performance boosting capabilities of side-by-side mode. However, when you add in the new CoreLink MMU-600, the Mali-D71 really comes into its own. The way in which the MMU-600 optimises the memory subsystem allows the Mali-D71 to tolerate long delays on the system bus and make the most of the available memory bandwidth, driving the highest performing VR displays at up to 4K120.

  3. Multi-Window Display

As we use our phones for more and more, it’s inevitable that we begin to see a need to complete several tasks at once. Where once we would sit at our desktop with half the screen showing the webinar we were ignoring while the other half showed our emails or Facebook timeline, we now expect to be able to enjoy the same levels of multitasking (or procrastination) on mobile. This means the display subsystem has to work even harder in order to deliver these different activities simultaneously. Previous generations of display processor could handle up to 4 layers, whereas Mali-D71 has doubled that capability to deliver up to 8 Android composition layers in single display mode. Coupled with the ability to split your screen, this means the Mali-D71 can handle your UI, navigation bars, status info, as well as a couple of totally separate apps, without breaking a sweat.
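To illustrate why the layer limit matters, here is a toy sketch of how a compositor might split work between the display processor and the GPU. The policy, the function name and the layer names are all hypothetical: real Android composition decisions depend on formats, scaling limits and much more, and this is not how any particular driver is implemented.

```python
# Toy composition planner: layers beyond the hardware limit fall back to the
# GPU, whose flattened result then occupies one hardware layer itself.
# The policy and all names here are hypothetical, for illustration only.

def plan_composition(layers, hw_layer_limit):
    """Return (layers composed directly in hardware, layers sent to the GPU)."""
    if len(layers) <= hw_layer_limit:
        return list(layers), []
    # Keep one hardware layer free for the GPU-composited output.
    direct = list(layers[:hw_layer_limit - 1])
    gpu_fallback = list(layers[hw_layer_limit - 1:])
    return direct, gpu_fallback

split_screen = ["video", "app_a", "app_b", "status_bar", "nav_bar", "notification"]

direct_4, gpu_4 = plan_composition(split_screen, hw_layer_limit=4)
direct_8, gpu_8 = plan_composition(split_screen, hw_layer_limit=8)

print("limit 4:", len(direct_4), "direct,", len(gpu_4), "via GPU")
print("limit 8:", len(direct_8), "direct,", len(gpu_8), "via GPU")
```

With six layers in this made-up split-screen scene, a four-layer limit pushes three layers onto the GPU, while an eight-layer limit keeps the GPU out of composition entirely.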

multi window composition

  4. Screens Screens Screens

As with many things in the technology industry, consistency is king. It’s really hard to adjust your apps and games to work across multiple platforms, and the same is true for display panels. There is such a vast array of technologies, performance points, not to mention ages, of display panels out there that it can be really hard to know what information the display processor might require from the panel, and vice versa, in order to work at its best. This is where Arm’s awesome ecosystem of partners comes in. By working with various experts across the industry we can target the widest variety of panel vendors and ensure our display solutions are capable of taking the available information and optimizing content for the best possible viewing experience, no matter the panel.

The display ecosystem often gives us unprecedented access to experts we can collaborate with to deliver the best possible experience.

VR and high-dynamic range use cases require WQHD+ and 4K resolutions along with 90/120 frame rates, bringing new power, cost and time-to-market challenges to the consumer electronics market. Arm’s Mali-D71 display processor along with Synopsys’ silicon-proven DesignWare MIPI DSI Host Controller IP with VESA DSC encoder and MIPI D-PHY IP deliver a complete display solution, ensuring a seamless integration of these key IP components into the application processor for different modes of data transfer and display panel characteristics.

Hezi Saar, Senior Product Marketing Manager at Synopsys

Mali-D71 can drive unprecedented pixel throughputs of up to 4K120 for new display-based mobile products, such as AR/VR headsets. With Mali-D71 and Hardent’s VESA DSC combined solution, you can reduce the transmission bandwidth by up to 3X with visually lossless compression, enabling a more immersive VR experience within a given mobile power budget.

Alain Legault, VP IP Products at Hardent
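To put the 3X figure in rough numbers, the sketch below estimates the link bandwidth for a 4K120 stream with and without compression. It assumes 24 bits per pixel before compression and counts active pixels only; the 3:1 ratio is the upper bound quoted above, and real link budgets also include blanking and protocol overhead.

```python
# Rough display-link bandwidth for 4K120, with and without 3:1 compression.
# 24 bits per pixel and active-pixels-only are assumptions for illustration.

def link_bandwidth_bps(width, height, fps, bits_per_pixel, compression_ratio=1):
    """Bits per second needed to carry the active pixels of the stream."""
    return width * height * fps * bits_per_pixel / compression_ratio

uncompressed = link_bandwidth_bps(3840, 2160, 120, 24)
compressed = link_bandwidth_bps(3840, 2160, 120, 24, compression_ratio=3)

print(f"uncompressed:   {uncompressed / 1e9:.2f} Gbit/s")
print(f"3:1 compressed: {compressed / 1e9:.2f} Gbit/s")
```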

Arm and Analogix are actively working together to define a protocol needed to optimize the workload on both the application processor and the display driver IC for HMD VR/AR applications, with the ultimate goal of enabling an optimal solution in terms of performance, cost, and power consumption through the entire AR/VR system.

Ning Zhu, CTO of Analogix

The Arm Complete Display Solution

Today we have delivered the first complete Arm Display Solution to support all the latest use cases across the next generation of high end devices. Whilst each is an excellent standalone product, the greatest benefits are seen when all three are implemented together to achieve the very peak of those performance points we’ve talked about. With a brand new architecture, plus the pre-optimized software stack and integrated technologies of CoreLink MMU-600 and Assertive Display 5, there’s no doubt we’ll be powering dazzling displays in devices to come.
