Mucho GPU Compute, amigo!

I have just returned from my third Mobile World Congress (MWC) in Barcelona, and I have no doubt that this year’s was the best one for me so far! We continue to see a strong trend of independent third-party companies using ARM® Mali™ GPU Compute to improve their products. This year we were not just showing demos: we were showing mature solutions on shipping hardware, delivering measured benefits to end users. In case you weren’t at MWC this week, below are some examples of the great GPU Compute solutions that were on show at the event.

Mature implementations of VP9 and HEVC: products, not just demos

As announced last week, Ittiam Systems demonstrated their HEVC and VP9 implementations accelerated using ARM Mali-T600 GPU Compute technology. Ittiam has been supporting OpenCL™ on Mali GPUs in their HEVC codec for over a year now, and their Mali-optimized VP9 implementation (which is built on the same core technology) has been showcased since last year. The tight partnership between ARM and Ittiam has enabled us to optimize our device drivers for this type of workload, which in turn improves performance and reduces energy and bandwidth consumption for all users. At this year’s event we were able to demonstrate the maturity of both HEVC and VP9 using GPU Compute: 1080p30 content was shown to decode comfortably on previous-generation SoCs, such as those that have been shipping in devices since as early as 2012. Just imagine the benefits you will get when using the latest Mali GPUs!

Watch Mukund Srinivasan, General Manager for Consumer and Mobility Business at Ittiam, talking about how leveraging ARM Cortex® CPUs with ARM NEON™ technology and ARM Mali GPU Compute has enabled them to improve the performance and energy efficiency of their VP9 and HEVC implementations:

At the ARM booth we showcased a demo kindly supplied to us by the YouTube team, in which we streamed real, live VP9-encoded content on a Samsung Galaxy NotePRO 12.2 tablet.


Thanks to the collaboration with partners like Ittiam Systems, VP9 can be enabled with the additional benefits of Mali GPU Compute. Quoting Matt Frost, Senior Business Product Manager, WebM Project: "With VP9, ARM Mali-based devices will allow people to watch YouTube in high definition at half the bandwidth currently used."

As mentioned in the past, many codec vendors in the industry are now working on HEVC implementations and optimizing them for Cortex CPUs and Mali GPUs. Examples include Squid Systems, PixTree, VisualOn, ArcSoft and more. At MWC 2014, Aricent, a global innovation and services company, also demonstrated their implementation, which has been accelerated using ARM Mali GPU Compute.


Camera applications and real time video processing and analysis

Pre- and post-processing of still and moving images is also a great use case for GPU Compute. At last year’s event we extensively demonstrated the benefits of GPU Compute acceleration with our partners Synthesis Corporation and MulticoreWare. As the ecosystem of partners around GPU Compute continues to grow, more solutions are becoming available.

Alva Systems provides a super-resolution solution that combines interpolation and texture enhancement technology with a smart colour compensation technique. It was optimized for ARM Mali GPUs and supports both the OpenGL® ES and OpenCL APIs. Alva also implemented an advanced image stabilization solution that, in addition to minimizing shake and motion effects when recording video (pre-encode), can also correct a pre-encoded video stream via real-time post-processing during playback. Thanks to GPU Compute, both the super-resolution and stabilization solutions support at least 1080p30. Both were demonstrated at MWC14 on an ARM Mali-T628 MP6 GPU-based device.

Watch Angela Wei, VP of Sales and Marketing at Alva Systems, discussing this in the following video:

ThunderSoft, another ARM Mali partner, is a software and services company headquartered in China. One of their products is UCam, a very popular turn-key camera solution with over 50 million units shipped (downloads and pre-loads). It features over 60 real-time image processing effects operating at full frame, optimized for NEON and GPU processing. ThunderSoft understands the importance of GPU Compute acceleration and has been collaborating with ARM on optimizing some of the UCam image processing filters using RenderScript on Android. At MWC we were able to demonstrate real-time “manga effect” filters applied to a live feed on a Google Nexus 10 tablet. The standard algorithm implementation (with no GPU offload) only achieves a handful of frames per second, and even this fully loads both CPUs. Converting the filters to RenderScript enabled the bulk of the processing to be offloaded to the GPU. We recorded a CPU load reduction of over 40%, whilst performance improved manyfold, enabling real-time use of the application.

Seth Bernsen, President of ThunderSoft America, talks about the partnership with ARM and the merits of using GPU Compute:

We see a lot of excitement in the ecosystem around the many partners developing software-based computer vision applications. In addition to gesture user interfaces, at this year’s event we also hosted an advanced face detection and analysis solution implemented by PUX (Panasonic). This technology has been optimized for GPU Compute since last year and was shown publicly for the first time at the Embedded Technology 2013 conference in Yokohama. We were excited to showcase an improved implementation at MWC 2014. The demo detects up to 20 faces at the same time and, for each one, estimates gender, age, facial expression and eye gaze. Imagine this used for profiling people viewing a shop window.


Beyond Mobile

Our key theme for this event was that ARM technology deployment goes beyond mobile and is present everywhere, from sensors to servers. So too is GPU Compute on ARM Mali GPUs.

A year ago we previewed a prototype by Aptina. This year we were able to show our customers an entire ISP pipeline, from raw, high-resolution, interlaced HDR sensor data all the way to rendering to screen - all of this running on the GPU. As discussed in my presentation at the Electronics Imaging event a few weeks ago, a significant number of driver optimizations now enable this kind of application on existing SoCs. The computational load required for this type of ISP work, in the absence of a hardware ISP, cannot be handled by the CPU on its own. GPU Compute enables this use case.

We also demonstrated gesture UI improved using GPU Compute. This is a fantastic example of applied machine vision. Our partner, eyeSight Technologies, is a leader in this field and has collaborated with ARM for a long time to improve the robustness and performance of their gesture detection solution using OpenCL on ARM Mali GPUs. At this event we extended the scope of this application beyond driving a DTV UI control to also showcase how GPU Compute improves in-car UI. The challenge for many gesture detection solutions is their use in poor lighting conditions. eyeSight demonstrated that with the additional compute power of the GPU you can significantly improve the robustness and accuracy of gesture detection.


Another one of our partners, AccelerEyes, produces software libraries and tools for GPU Compute. At last year’s SC13 event they showcased a port of their ArrayFire HPC maths library accelerated using OpenCL on an ARM Mali-T600 GPU, and this was again available to see at MWC this week. Check out Scott Blakeslee’s blog here.

The flexibility and scalability of our architecture enables us to target a variety of use cases, from sensor to server. We also support Full Profile and 64-bit natively, in hardware. After years of evangelising the benefits of such an approach, it is nice to see other players in the industry follow us down this avenue. Solutions similar to those we have been pioneering for some time are starting to appear across the market; many of these were shown or announced at MWC 2014. It is rewarding to see many industry players being inspired by ARM Mali GPU Compute and the innovative efforts of the ARM ecosystem.

I have been overwhelmed by the requests for meetings and the positive feedback from our partners and customers. I look forward to future events where we can continue to showcase the great work that our partners and ecosystem are doing to lead the industry in GPU Compute applications on ARM Mali GPUs.

Graphics & Multimedia blog