Image and signal processing is arguably one of the fields in which the massive computational power of GPU Compute has the potential to be a game changer. This is already true for desktop computing, with applications such as Adobe® Photoshop™ supporting GPU acceleration, and I believe the same will soon happen on smartphones, tablets and other mobile devices.
We at Synthesis Corporation, a company that started out as a primarily hardware-oriented developer of custom image processing Intellectual Property, have been aware of this trend, and have been working in close collaboration with ARM to port our image processing algorithms to the Mali™-T600 series of GPUs and optimize them for it. Mali-T600 represents the state of the art in GPU architecture for mobile devices. With support for OpenGL® ES 1.1, 2.0 and 3.0, OpenCL® Full Profile and Google® RenderScript, it offers the software developer a wide range of APIs for graphics and compute.
As a first step, we decided to port our proprietary super-resolution (up-scaling) and adaptive luminance/dynamic range enhancement algorithms. Both have seen a number of design wins in FPGA and ASIC-based products for industrial and consumer applications, but turned out to be too computationally expensive for customers who wanted an embedded software-based solution. In this blog post, we would like to present an overview of these two GPU-based solutions and report the performance gains we achieved by optimizing them for Mali-T600 GPU Compute.
A digital image, or a video frame, usually consists of a 2D array of discrete sample pixels, which can only be properly displayed at a fixed screen resolution. However, screen resolution tends to vary widely across platforms, especially for mobile devices, thus making sharing of digital media content across different platforms non-trivial.
The most straightforward (and widely used) method for obtaining a higher resolution image from a given lower resolution image is to compute the values of the “missing” pixels with a suitable interpolation function. However, the resulting scaled image is usually subject to artifacts and blurring, as depicted in the following figure.
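For readers unfamiliar with interpolation scalers, here is a minimal sketch of conventional bilinear up-scaling in plain Python. It is a generic textbook method, not our super-resolution algorithm, and the function name `bilinear_upscale`, the integer `factor` parameter and the tiny test image are assumptions made purely for illustration.

```python
def bilinear_upscale(image, factor):
    """Upscale a 2D grayscale image (a list of rows) by an integer factor."""
    src_h, src_w = len(image), len(image[0])
    dst_h, dst_w = src_h * factor, src_w * factor
    out = []
    for y in range(dst_h):
        # Map the destination pixel centre back into source coordinates,
        # clamping at the image border.
        sy = min(max((y + 0.5) / factor - 0.5, 0.0), src_h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, src_h - 1)
        fy = sy - y0
        row = []
        for x in range(dst_w):
            sx = min(max((x + 0.5) / factor - 0.5, 0.0), src_w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, src_w - 1)
            fx = sx - x0
            # Blend the four nearest source pixels with fixed weights that
            # depend only on position, never on image content.
            top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
            bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


# A hard vertical edge: two dark columns next to two bright ones.
edge = [[0, 0, 255, 255] for _ in range(4)]
scaled = bilinear_upscale(edge, 2)
print([round(v) for v in scaled[0]])  # [0, 0, 0, 64, 191, 255, 255, 255]
```

The source edge jumps from 0 to 255 between two adjacent pixels, while the scaled row passes through the intermediate values 64 and 191. This smearing of edges is the blurring that a content-independent interpolation function tends to produce.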
In contrast, in our proprietary super-resolution scaling algorithm, the interpolation formula is adaptively modified according to the content of the image in the neighborhood of the target pixel, thus achieving sharper and artifact-free results in comparison to conventional scalers, as shown in the following samples.
All this comes at a price: our algorithm is considerably more computationally expensive than popular interpolation scalers such as bilinear and bicubic interpolation. Since our initial development target was to achieve real-time Full HD output on a mobile device, we chose to implement the software in OpenGL ES for a Google Nexus™ 10 target device (featuring a Samsung® Exynos™ 5 Dual SoC with two ARM Cortex™-A15 processor cores and an ARM Mali-T604 mobile GPU).
As illustrated in the chart below, we have achieved a 5x speedup by using the CPU and GPU together through RenderScript and OpenGL ES. This is roughly equivalent to a Full HD output frame rate of 60fps, a level of performance we would never have dreamed of achieving on a mobile device just a few years ago.
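To put these figures in perspective, the short calculation below works out what Full HD at 60fps means in terms of pixel throughput and per-frame time budget. The implied baseline frame rate is only an estimate: it assumes the 5x speedup is measured against a baseline running at the same resolution, which the text does not state explicitly. All variable names are illustrative.

```python
def pixel_rate(width, height, fps):
    """Number of output pixels that must be produced per second."""
    return width * height * fps


FULL_HD = (1920, 1080)
target_fps = 60
speedup = 5  # speedup figure reported in the text

rate = pixel_rate(*FULL_HD, target_fps)
frame_budget_ms = 1000.0 / target_fps
# Assumption: the baseline was measured at the same Full HD resolution.
implied_baseline_fps = target_fps / speedup

print(f"{rate / 1e6:.1f} Mpixel/s")                                  # 124.4 Mpixel/s
print(f"{frame_budget_ms:.2f} ms per frame")                         # 16.67 ms per frame
print(f"about {implied_baseline_fps:.0f} fps before the speedup")    # about 12 fps
```

In other words, the scaler has roughly 16.7 ms to produce each frame of about two million output pixels.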
HDR (High Dynamic Range)
Synthesis Corporation’s proprietary HDR software enhances the luminance of dark areas in an image without affecting other regions, unlike conventional gamma/tone-curve editing which applies the same correction uniformly to the entire image, hence over-saturating highlights.
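The difference between a uniform tone curve and a locally adaptive correction can be sketched in a few lines of Python. This is a generic illustration and not our proprietary HDR algorithm: the local-mean heuristic, the function names `global_gamma` and `local_adaptive`, and the `radius` and `min_gamma` parameters are all assumptions chosen to keep the example small. A single scanline stands in for a full image.

```python
def global_gamma(pixels, gamma):
    """Apply the same gamma curve to every pixel (values in 0..255)."""
    return [255.0 * (p / 255.0) ** gamma for p in pixels]


def local_adaptive(pixels, radius=1, min_gamma=0.5):
    """Pick a gamma per pixel from the mean brightness of its neighbourhood.

    Dark neighbourhoods get a gamma close to min_gamma (strong lift);
    bright neighbourhoods get a gamma close to 1.0 (almost unchanged).
    """
    out = []
    n = len(pixels)
    for i, p in enumerate(pixels):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        local_mean = sum(pixels[lo:hi]) / (hi - lo)
        gamma = min_gamma + (1.0 - min_gamma) * (local_mean / 255.0)
        out.append(255.0 * (p / 255.0) ** gamma)
    return out


# One scanline with a dark region on the left and a bright one on the right.
row = [16, 16, 16, 200, 200, 200]
uniform = global_gamma(row, 0.5)
adaptive = local_adaptive(row)

print([round(v) for v in uniform])
print([round(v) for v in adaptive])
```

With these inputs, both approaches lift the dark pixels substantially, but the uniform curve also pushes the already bright pixels well above their original value, while the adaptive version leaves them close to where they started.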
In contrast with some HDR algorithms which use multiple source images taken at different exposures to generate a single HDR image, our algorithm uses a single source image, thus being suitable for enhancing arbitrary still-pictures and video, i.e. it does not require special camera/image sensor functionality.
A GPU-optimized version of the HDR software has been implemented in OpenCL for the InSignal® Arndale™ development board (which features the same Samsung Exynos 5 Dual SoC that powers the Google Nexus 10 tablet). The current version of the software is capable of processing VGA video at 30fps, and we expect to further improve performance in future versions. By incorporating algorithmic and coding optimizations for the Mali-T600 architecture, we achieved a 16x performance improvement over our starting point, as can be seen in the chart below.
About Synthesis Corporation
One of the pioneering industry-academic cooperative ventures in Japan, Synthesis Corporation’s business model is to deliver advanced semiconductor IPs (Intellectual Properties) as well as software solutions based on state-of-the-art research output from our academic members, with a focus on the fields of multimedia and communications.
In recent years, Synthesis Corporation has directed a great deal of effort towards providing a comprehensive range of high-performance solutions for digital image and video processing, including real-time software-based super-resolution scalers and automatic luminance/dynamic-range enhancers for ARM CPUs and ARM Mali GPUs.