The Applied Systems Engineering (ASE) team at ARM is responsible for both implementing and presenting technical demonstrations that show ARM® technology in context. These demonstrations make their way to trade shows and other events, and are also often discussed in online features such as this series of blogs. Here we introduce our Movie Vision app, which explores the use of GPU Compute.
The Movie Vision app was conceived as a demonstration to highlight emerging heterogeneous computing capabilities in mobile devices. Specifically, it makes use of Android’s RenderScript computation framework. GPU Compute has generated a great deal of discussion, and a variety of exciting new use cases have been highlighted. On the ARM Connected Community you can find discussions on the advantages and benefits of GPU Compute, the methods available and details of the compute architecture provided by ARM® Mali™ GPUs. The objectives of the Movie Vision application are to provide a visually compelling demonstration that leverages the capabilities of ARM technology and GPU Compute, to explore the RenderScript API and to provide an example application for the Mali Developer community.
Movie Vision takes the preview feed from an Android device’s camera, applies a visual effect to that feed and displays it on the device screen. A number of visual effect filters have been implemented, each modeled on various special effects seen in popular movies over the years. Frame-by-frame, this breaks down to applying one or more mathematical operations to a large array of data – a task well suited to the kind of massive parallelism that GPU Compute provides.
In this series of blogs, we’ll go through each of the visual effect filters we implemented and use these to explore and understand the capabilities of the RenderScript API.
RenderScript is well described on the Android Developers website, which is an excellent place to start for details and instructions on its use. To summarize from a developer’s standpoint: you write your high-performance ‘kernels’ in C99 syntax and then use a Java API to invoke them from your application and to manage the data going into and out of them. These kernels are parallel functions executed on every element of the data you pass into the script. Under the hood, RenderScript code is first compiled to intermediate bytecode, and then further compiled at runtime to machine code by a sequence of Low Level Virtual Machine (LLVM) compilers. This is not conceptually dissimilar to the standard Android or Java VM model, but obviously more specialized. The final machine code is generated by an LLVM back end on the device, and optimized for that device. On the Google Nexus 10 used for development of the Movie Vision application, RenderScript can therefore make use of either the dual-core ARM Cortex®-A15 CPU or the GPU Compute capabilities of the Mali-T604 GPU.
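To make that developer model concrete, here is a minimal, hypothetical example (it is not taken from the Movie Vision source): a trivial kernel that inverts each pixel, plus the Java code that runs it over an allocation. The file name, package name and variable names are illustrative assumptions.

// invert.rs - a minimal RenderScript kernel (assumed file and package names)
#pragma version(1)
#pragma rs java_package_name(com.example.movievision)

// Called once per element of the input allocation.
void root(const uchar4 *in, uchar4 *out) {
    out->r = 255 - in->r;
    out->g = 255 - in->g;
    out->b = 255 - in->b;
    out->a = in->a;
}

// Java side: create the reflected script class and run the kernel over every pixel.
RenderScript rs = RenderScript.create(context);
ScriptC_invert script = new ScriptC_invert(rs, getResources(), R.raw.invert);
Allocation in = Allocation.createFromBitmap(rs, srcBitmap);
Allocation out = Allocation.createTyped(rs, in.getType());
script.forEach_root(in, out);
out.copyTo(dstBitmap);

A minimal, hypothetical RenderScript kernel and the Java code that invokes it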
The Movie Vision app has the following structure:
- Main Android activity class
- UI layout/functionality
- Setup of camera preview
- Setup of face detection
- Setup for sharing data with RenderScript kernels
- Callback for each camera preview frame
- Callback for rendering decorations
The functionality of the app is fairly simple. The Android Camera API provides a hook to receive a camera preview callback, with each call delivering a single frame from the camera. The main Movie Vision Activity receives this and passes the frame data to the currently selected image filter. After the frame has been processed, the resulting image is rendered to the screen. A further callback to the selected filter allows decorations such as text or icons to be rendered on top of the image. The Android AsyncTask class is used to decouple image processing from the main UI thread.
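As a minimal sketch of that flow (the class, field and view names here are illustrative assumptions rather than the Movie Vision source), the preview callback can hand each frame to an AsyncTask, which calls the current filter’s processFrame() off the UI thread and then displays the result:

// Sketch only: mCurrentFilter, mWidth, mHeight, mLastMovedTimestamp and
// mPreviewImageView are assumed members of the main Activity.
private final Camera.PreviewCallback mPreviewCallback = new Camera.PreviewCallback() {
    @Override
    public void onPreviewFrame(byte[] data, Camera camera) {
        // Each call delivers one YUV preview frame; process it off the UI thread.
        new FrameTask().execute(data);
    }
};

private class FrameTask extends AsyncTask<byte[], Void, Bitmap> {
    @Override
    protected Bitmap doInBackground(byte[]... frames) {
        Bitmap bmp = Bitmap.createBitmap(mWidth, mHeight, Bitmap.Config.ARGB_8888);
        // Hand the raw frame to the currently selected image filter.
        mCurrentFilter.processFrame(frames[0], bmp, mLastMovedTimestamp);
        return bmp;
    }

    @Override
    protected void onPostExecute(Bitmap result) {
        // Back on the UI thread: display the processed frame. The filter's
        // decoration callback then draws any HUD overlays on top of it.
        mPreviewImageView.setImageBitmap(result);
    }
}

A sketch of decoupling frame processing from the UI thread with AsyncTask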
Each Image Filter shares some relatively common functionality. All of them perform a conversion of the camera preview data from YUV to RGB: the data from the camera is in YUV, but the filter algorithms and output for Movie Vision require RGB values. The Android 4.2 release included updates to RenderScript which added an “Intrinsic” script to perform this operation. A RenderScript Intrinsic is an efficient implementation of a common operation. These include blends, blurs, convolutions and other operations, including this YUV to RGB conversion. More information can be found on the Android Developer Website. Each Image Filter Java class also configures its ‘effect’ script. Configuration generally consists of allocating some shared data arrays (using the Allocation object) for input and output and allocating or setting any other values required by the script.
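As an illustration of the intrinsic route, a YUV to RGB conversion could be set up and run roughly as follows; the allocation and variable names are assumptions for illustration, and the app’s own conversion script (mConv in the snippets below) may be implemented differently:

// Hypothetical sketch of the YUV to RGB intrinsic added in Android 4.2; names are illustrative.
ScriptIntrinsicYuvToRGB yuvToRgb = ScriptIntrinsicYuvToRGB.create(mRS, Element.U8_4(mRS));
yuvToRgb.setInput(yuvAlloc);   // allocation holding the raw camera YUV bytes
yuvToRgb.forEach(rgbAlloc);    // allocation receiving the converted RGBA pixels

A sketch of using the YUV to RGB RenderScript intrinsic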
/**
 * RenderScript Setup
 */
mTermScript = new ScriptC_IllBeBack(mRS, res, R.raw.IllBeBack);

Type.Builder tb = new Type.Builder(mRS, Element.RGBA_8888(mRS));
tb.setX(mWidth);
tb.setY(mHeight);
mScriptOutAlloc = Allocation.createTyped(mRS, tb.create());

// YUV preview data: a full-resolution Y plane plus subsampled U and V planes.
mScriptInAlloc = Allocation.createSized(mRS, Element.U8(mRS),
        (mHeight * mWidth) + ((mHeight / 2) * (mWidth / 2) * 2));
Initial setup of a RenderScript kernel
The first filter, “I’ll Be Back”, applies a famous movie effect: a red tint and an active heads-up display (HUD) that highlights faces. It is probably the simplest Movie Vision effect in terms of its RenderScript component; however, a desired additional feature of this filter highlights some challenges.
/**
 * Processes the current frame. The filter first converts the YUV data
 * to RGB via the conversion script. Then the RenderScript kernel is run, which applies a
 * red hue to the image. Finally the filter looks for faces and objects, and on finding one
 * draws a bounding box around it.
 *
 * @param data               The raw YUV data.
 * @param bmp                Where the result of the ImageFilter is stored.
 * @param lastMovedTimestamp Last time the device moved.
 */
public void processFrame(byte[] data, Bitmap bmp, long lastMovedTimestamp) {
    mScriptInAlloc.copyFrom(data);
    // Convert YUV to RGB, then run the "I'll Be Back" effect kernel in place on the RGB data.
    mConv.forEach_root(mScriptInAlloc, mScriptOutAlloc);
    mTermScript.forEach_root(mScriptOutAlloc, mScriptOutAlloc);
    mScriptOutAlloc.copyTo(bmp);
}
“I’ll Be Back” processing each camera frame
The Java component of this filter is relatively straightforward. The YUV to RGB conversion and image effect RenderScript kernels are configured. For each camera preview frame, we convert to RGB and pass the image to the effect kernel. After that, in a second pass on the frame, we render our HUD images and text if any faces have been detected: a box is drawn around each face, and some text is printed to give the impression that the face is being analyzed.
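A rough sketch of what such a decoration pass can look like is below; the method signature, paint settings and label text are illustrative assumptions rather than the Movie Vision source, and the face rectangle is assumed to already be mapped into bitmap coordinates:

// Hypothetical decoration callback: draw a HUD box and label around a detected face.
public void drawDecorations(Canvas canvas, Rect faceRect) {
    Paint paint = new Paint();
    paint.setColor(Color.WHITE);
    paint.setStyle(Paint.Style.STROKE);
    paint.setStrokeWidth(3.0f);

    // Bounding box around the detected face.
    canvas.drawRect(faceRect, paint);

    // 'Analysis' text beneath the box to complete the HUD effect.
    paint.setStyle(Paint.Style.FILL);
    paint.setTextSize(24.0f);
    canvas.drawText("SCANNING...", faceRect.left, faceRect.bottom + 30, paint);
}

A sketch of rendering HUD decorations over the processed frame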
/*
 * Function: root
 * param uchar4 *in  The current RGB pixel of the input allocation.
 * param uchar4 *out The current RGB pixel of the output allocation.
 */
void root(const uchar4 *in, uchar4 *out) {
    uchar4 p = *in;

    // Keep the red channel and zero the green and blue channels.
    // This creates the red 'hue' effect.
    out->r = p.r;
    out->g = 0x00;
    out->b = 0x00;
    out->a = 0xff;
}
“I’ll Be Back” RenderScript Kernel root function
The RenderScript component is very simple. The output green and blue channels are zeroed, so that just the red channel is visible in the final image.
Initially, an attempt was made to add pseudo object detection to this effect, such that ‘objects’ as well as faces would be highlighted by the HUD. A prototype was implemented using the OpenCV library, with its included implementation of a contour detection algorithm. It is worth noting that this approach would not utilize GPU Compute and would run only on the CPU. Contour detection is a relatively complex, multi-stage computer vision algorithm. First, a Sobel edge detection filter is applied to bring out the edges of the image. Then, a set of steps is applied to identify joined edges (contours). The prototype object detection then interpreted these to find contours in the image that were likely to be objects; generally, large, rectangular regions were chosen. One of these would be selected and highlighted with the HUD decorations as an ‘object of interest’.
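A CPU-only prototype along these lines could be sketched with the OpenCV Java bindings roughly as follows; the rgbaFrame input Mat, the threshold values and the choice of helper calls are illustrative assumptions, not the original prototype code:

// Hypothetical sketch of contour-based object detection with OpenCV (CPU only).
Mat gray = new Mat();
Imgproc.cvtColor(rgbaFrame, gray, Imgproc.COLOR_RGBA2GRAY);

// Sobel edge detection to bring out the edges of the image.
Mat gradX = new Mat(), gradY = new Mat(), edges = new Mat();
Imgproc.Sobel(gray, gradX, CvType.CV_16S, 1, 0);
Imgproc.Sobel(gray, gradY, CvType.CV_16S, 0, 1);
Core.convertScaleAbs(gradX, gradX);
Core.convertScaleAbs(gradY, gradY);
Core.addWeighted(gradX, 0.5, gradY, 0.5, 0, edges);
Imgproc.threshold(edges, edges, 80, 255, Imgproc.THRESH_BINARY);

// Join the edges up into contours.
List<MatOfPoint> contours = new ArrayList<MatOfPoint>();
Imgproc.findContours(edges, contours, new Mat(), Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);

// Pick the largest bounding rectangle as the candidate 'object of interest'.
Rect best = null;
for (MatOfPoint contour : contours) {
    Rect r = Imgproc.boundingRect(contour);
    if (best == null || r.area() > best.area()) {
        best = r;
    }
}

A sketch of the CPU-based contour detection approach (not part of the final app)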
The issue with this object detection prototype was that it required several passes of algorithmic steps, with intermediate data sets. Porting this to RenderScript to take advantage of GPU Compute would have resulted in several RenderScript scripts chained together. At the time of initial development this introduced some inherent inefficiencies, although the addition of ‘Script Groups’ in Android 4.2 helps to address this. For now, porting the contour detection algorithm to RenderScript remains an outstanding project.
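For reference, a Script Group connects the output of one kernel directly to the input of the next, so intermediate results stay on the compute device rather than being copied back to Java between passes. The hypothetical sketch below chains two colour-matrix intrinsics; the matrices and allocation names are illustrative assumptions, not Movie Vision code:

// Hypothetical Script Group sketch: chain two colour-matrix kernels into one group.
ScriptIntrinsicColorMatrix toGrey = ScriptIntrinsicColorMatrix.create(mRS, Element.U8_4(mRS));
toGrey.setGreyscale();
ScriptIntrinsicColorMatrix tint = ScriptIntrinsicColorMatrix.create(mRS, Element.U8_4(mRS));
tint.setColorMatrix(new Matrix4f(new float[] {
        1.0f, 0.0f, 0.0f, 0.0f,   // keep red
        0.0f, 0.2f, 0.0f, 0.0f,   // attenuate green
        0.0f, 0.0f, 0.2f, 0.0f,   // attenuate blue
        0.0f, 0.0f, 0.0f, 1.0f }));

Type connectType = new Type.Builder(mRS, Element.U8_4(mRS)).setX(mWidth).setY(mHeight).create();

ScriptGroup.Builder builder = new ScriptGroup.Builder(mRS);
builder.addKernel(toGrey.getKernelID());
builder.addKernel(tint.getKernelID());
builder.addConnection(connectType, toGrey.getKernelID(), tint.getKernelID());
ScriptGroup group = builder.create();

// Run the whole chain in one call; only the final result comes back to Java.
group.setInput(toGrey.getKernelID(), mScriptOutAlloc);
group.setOutput(tint.getKernelID(), mTintedAlloc);
group.execute();

A sketch of chaining kernels with a Script Group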
That concludes the first blog in this series on the Movie Vision App. Carry on reading with The Movie Vision App: Part 2 for examples of increasingly complex visual effect filters. Can you guess the movie that inspired the “I’ll Be Back” filter?
This work by ARM is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. However, in respect of the code snippets included in the work, ARM further grants to you a non-exclusive, non-transferable, limited license under ARM’s copyrights to Share and Adapt the code snippets for any lawful purpose (including use in projects with a commercial purpose), subject in each case also to the general terms of use on this site. No patent or trademark rights are granted in respect of the work (including the code snippets).