
ARM Mali Graphics


In this blog series we’ll be looking at the current status of Virtual Reality with a special focus on mobile VR, where it’s heading in the future and what ARM® can do to help us get there. We’ll be covering the common challenges and pitfalls and how these can be avoided to produce a great VR experience. We’ll also keep you up to date with other blogs and discussions around technical tips, tutorials and FAQs, so stay tuned for the future of mobile VR.


Where VR is now

Virtual Reality isn’t new; people have been talking about it since the 90s, so why has the industry never quite taken off in the way we might expect? The quick answer is that the technology simply wasn’t there. The hardware was prohibitively expensive and very bulky, and the graphics capabilities were too limited to produce a successful VR user experience - unless you consider motion sickness a success. Now however, lower cost hardware based on existing platforms is changing the game, with mobile platforms offering console-like performance. Not only that, but existing mobile devices already contain many of the sensors VR requires, from gyros to accelerometers, opening up a whole world of mobile VR possibilities.


What’s next for VR

The Virtual Reality industry is forecast to be worth US$30 billion by 2020, and that all has to come from somewhere.

Fig.1 Digi-Capital VR Revenue Forecast


Gaming is of course a huge industry and a high-end, immersive gamer experience can now be literally at your fingertips. Mobile VR allows you to become fully involved in your chosen game at home, work, or while trying to escape the monotony of public transport; but that’s not all VR can do. Researching a university assignment can be a chore, but how about if you could visit the most relevant museums or seminars without having to leave the dorm? VR allows us to see exhibitions in world class museums and galleries without an expensive trip to London, Paris, or anywhere else. Shopping, too, isn’t everyone’s favourite pastime, especially around the Christmas rush. Wouldn’t it be great if you could wander the aisles and compare options for your next car, sofa, TV or even pair of shoes, without tripping over pushchairs or being upsold by pushy assistants? All this is possible with the huge technical advances in VR, and it’s only a matter of time until this becomes a standard way of doing things.



Fig.2 nDreams® Perfect Beach experience allows you to get away from it all without leaving the sofa


So how does VR actually work?

Technology is the key to VR success and this blog series will talk about exactly what you need to make it happen. VR comes in mobile or desktop options, but according to Oculus® Co-founder Palmer Luckey, desktop VR is seriously compromised by the requirement for a ‘cable servant’ to follow the user around preventing trip hazards. So mobile VR is the quickest way forward, and the simplest of the mobile options allows you to simply slot your smartphone into the headset and get started. The headset provides you with a stereoscopic display, with two marginally different images rendered for the left and right eye, allowing the user to experience depth. Barrel distortion is then applied to the rendered images in post processing to counteract the curvature of the lenses.
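As a rough illustration, barrel distortion can be modelled as a radial scaling of each point away from the lens centre. The single-coefficient model and the coefficient value below are assumptions for demonstration only; real VR runtimes use per-device distortion models that are considerably more elaborate.

```c
/* Minimal radial barrel distortion for a point in lens-centred,
 * normalized coordinates. k1 is a single distortion coefficient that
 * would need tuning per headset; treat this as a sketch, not the
 * actual model any particular runtime uses. */
void barrel_distort(float x, float y, float k1, float *out_x, float *out_y)
{
    float r2 = x * x + y * y;       /* squared distance from lens centre */
    float scale = 1.0f + k1 * r2;   /* scaling grows towards the edges   */
    *out_x = x * scale;
    *out_y = y * scale;
}
```

Points at the centre of the image are unchanged while points near the edge are pushed outwards, which is what cancels the pincushion effect of the lens.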



Fig.3 Marginally different images for each eye allow the perception of depth and barrel distortion applies curvature to the image to counteract the curvature of the lens


Finally, sensors in the device detect the movement of your head and adjust the scene in real time to render the updated view to the headset and allow realistic visual feedback. Going forward, additional sensors will facilitate live hand-tracking for a truly immersive experience, and this can be combined with the use of an inbuilt or add-on controller to allow you to interact fully with your virtual surroundings.


VR Optimisation with Mali GPUs

As with any emerging technology, there are issues that can stand in the way of a truly successful VR user experience. These include low resolution blurring the image and compromising visual quality, or a low frame rate making the display appear stilted or jerky. A major issue experienced when developing for VR is latency, or the time it takes for the on-screen image to catch up with the user’s head movement, and this is one of the key causes of sickness or dizziness in VR users.

The ARM® Mali™ GPU family is the world’s #1 licensable GPU in terms of shipments and is perfectly positioned to deliver an optimum VR experience. The Mali GPU architecture enables high resolution and power saving through features such as Adaptive Scalable Texture Compression (ASTC), while ARM Frame Buffer Compression (AFBC) dramatically reduces system bandwidth, and performance scales fully across multiple cores. Mali support for extensions to OpenGL ES and EGL reduces latency and improves overall performance.


What we’re doing now

At events like VRTGO, ARM recently demonstrated how great a mobile VR experience can be with the Mali-based Samsung® Gear VR headset, a collaboration between Samsung Mobile and Oculus. The first version was based on the Galaxy Note 4, with the second generation now available for the Galaxy S6, both powered by the Mali-T760. The Ice Cave demo, featuring Geomerics Enlighten global illumination in collaboration with RealtimeUK, was easily ported to VR on the Samsung Gear VR headset; read about how we did this here.

Superior visuals and a smooth user experience are all possible in mobile VR and throughout this blog series we’ll be discussing the common challenges surrounding developing for VR and how ARM’s technology and Mali GPUs can help you overcome them.


Stay tuned for more on mobile VR!

It's been a while since my last performance blog, but one of those lunchtime coffee discussions about a blog I'd like to write was turned into a working tech demo by wasimabbas overnight, so thanks to him for giving me the kick needed to pick up the digital quill again. This time around I look at 2D rendering, and what OpenGL ES can do to help ...


A significant amount of mobile content today is still 2D gaming or 2D user-interface applications, in which the applications render layers of sprites or UI elements to build up what is on screen. Nearly all of these applications are actually using OpenGL ES to perform the rendering, but few applications actually leverage the 3D functionality available, preferring the simplicity of a traditional back-to-front algorithm using blending to handle alpha transparencies.




This approach works, but doesn’t make any use of the 3D features of the hardware, and so in many cases makes the GPU work a lot harder than it needs to. The impacts of this will vary from poor performance to reduced battery life, depending on the GPU involved, and these impacts are amplified by the trend towards higher screen resolutions in mobile devices. This blog looks at some simple changes to sprite rendering engines which can make the rendering significantly faster, and more energy efficient, by leveraging some of the tools which a 3D rendering API provides.


Performance inefficiency of 2D content


In 2D games the OpenGL ES fragment shaders are usually trivial – interpolate a texture coordinate, load a texture sample, blend to the framebuffer – so there isn’t much there to optimize. Any performance optimization for this type of content is therefore mostly about finding ways to remove redundant work completely, so that the shader never even runs for some of the fragments.


The figure in the introduction section shows a typical blit of a square sprite onto a background layer; the outer parts of the shield sprite are transparent, the border region is partially transparent so it fades cleanly into the background without any aliasing artifacts, and the body of the sprite is opaque. These sprite blits are rendered on top of what is in the framebuffer in a back-to-front render order, with alpha blending enabled.
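The blending referred to here is the usual "source over" operation. A one-line C model of it (per colour channel) makes the cost visible: the existing framebuffer value is an input, so every blended fragment implies a framebuffer read as well as a write.

```c
/* "Source over" blending per colour channel, as configured in OpenGL ES
 * with glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA). The 'dst'
 * argument is the value already in the framebuffer. */
float blend_over(float src, float src_alpha, float dst)
{
    return src * src_alpha + dst * (1.0f - src_alpha);
}
```

A fully transparent fragment (alpha 0) returns the destination unchanged, which is exactly the wasted work the rest of this blog sets out to remove.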


There are two main sources of inefficiency here:


  • Firstly, the substantial outer region around this sprite is totally transparent, and so has no impact on the output rendering at all, but takes time to process.
  • Secondly, the middle part of the sprite is totally opaque, completely obscuring many background pixels underneath it. The graphics driver cannot know ahead of time that the background will be obscured, so these background fragments have to be rendered by the GPU, wasting processing cycles and nanojoules of energy rendering something which is not usefully contributing to the final scene.


This is a relatively synthetic example with only a single layer of overdraw, but we see real applications where over half of all of the rendered fragments of a 1080p screen are redundant. If applications can use OpenGL ES in a different way to remove this redundancy then the GPU could render the applications faster, or use the performance headroom created to reduce the clock rate and operating voltage, and thus save a substantial amount of energy. Either of these outcomes sounds very appealing, so the question is how can application developers achieve this?


Test scene


For this blog we will be rendering a simple test scene consisting of a cover-flow style arrangement of the shield icon above, but the technique will work for any sprite set with opaque regions. Our test scene render looks like this:




… where each shield icon is actually a square sprite using alpha transparencies to hide the pieces which are not visible.


Tools of the trade


In traditional dedicated 2D rendering hardware there are not usually many options to play with; the application has to render the sprite layers from back to front to make sure blending functions correctly. In our case the applications are using a 3D API to render a 2D scene, so the question becomes what additional tools does the 3D API give the applications which can be used to remove redundant work?


The principal tool used in full 3D scene rendering to remove redundant work is the depth test. Every vertex in a triangle has a “Z” component in its position, which is emitted from the vertex shader. This Z value encodes how close that vertex is to the camera, and the rasterization process will interpolate the vertex values to assign a depth to each fragment which may need fragment shading. This fragment depth value can be tested against the existing value stored in the depth buffer and if it is not closer[1] to the camera than the current data already in the framebuffer then the GPU will discard the fragment without ever submitting it to the shader core for processing, as it now safely knows that it is not needed.
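A software model of that per-fragment decision (assuming the common GL_LESS compare function) is tiny:

```c
#include <stdbool.h>

/* Illustrative model of the GPU's early depth test with GL_LESS: a
 * fragment survives only if it is strictly closer (smaller depth) than
 * the value already in the depth buffer; the stored value is updated
 * when depth writes are enabled. */
bool depth_test_less(float fragment_depth, float *depth_buffer_value,
                     bool depth_write_enabled)
{
    if (fragment_depth < *depth_buffer_value) {
        if (depth_write_enabled)
            *depth_buffer_value = fragment_depth;
        return true;   /* fragment goes on to be shaded */
    }
    return false;      /* fragment discarded before shading */
}
```

The key point is that a discarded fragment never reaches the shader core, so its shading cost is zero.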


Using depth testing in “2D” rendering


Sprite rendering engines already track the layering of each sprite so that they stack correctly, so we can map this layer number to a Z coordinate value assigned to the vertices of each sprite which is sent to the GPU, and actually render our scene as if it has 3D depth. If we then use a framebuffer with a depth attachment, enable depth writes, and render the sprites and background image in front-to-back order (i.e. the reverse of the normal back-to-front blitting order) then the depth test will remove the parts of sprites and the background which are hidden behind other sprites.
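As a sketch, mapping layer numbers to depth values might look like this in C; the "layer 0 is front-most" convention and the clip-plane margin are assumptions of the example, not requirements of the technique.

```c
/* Map a sprite's layer index (0 = front-most, assumed convention) onto
 * a Z value in (0,1) so the depth test can reject hidden fragments.
 * A margin keeps sprites off the exact near and far clip planes. */
float layer_to_depth(int layer, int layer_count)
{
    return (layer + 1.0f) / (layer_count + 1.0f);
}
```

Lower layer numbers get smaller depth values, so with GL_LESS testing a front layer's fragments always beat those of the layers behind it.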


If we run this for our simple test scene, we get:




Uh oh! Something has gone wrong.


The issue here is that our square sprite geometry does not exactly match the shape of opaque pixels. The transparent parts of the sprites closer to the camera are not producing any color values due to the alpha test, but are still setting a depth value. When the sprites on a lower layer are rendered the depth testing means that the pieces which should be visible underneath the transparent parts of an earlier sprite are getting incorrectly killed, and so only the OpenGL ES clear color is showing.


Sprite geometry


To fix this issue we need to invest some time into setting up some more useful geometry for our sprites. We can only safely set the depth value when rendering front-to-back for the pixels which are totally opaque in our sprite, so the sprite atlas generation needs to provide two sets of geometry for each sprite. One set, indicated by the green area in the middle image below, covers only the opaque geometry, and the second, indicated by the green area in the right image below, picks up everything else unless it is totally transparent (in which case it can be dropped completely).




Vertices are relatively expensive, so use as little additional geometry as possible when generating these geometry sets. The opaque region must only contain totally opaque pixels, but the transparent region can safely contain opaque pixels and totally transparent pixels without side-effects, so use rough approximations that give a "good fit" without trying to get a "best fit". Note that for some sprites it isn’t worth generating the opaque region at all (there may be no opaque texels, or the area involved may be small), so some sprites may consist of only a single region rendered as transparent. As a rule of thumb, if your opaque region is smaller than 256 pixels it probably isn't worth bothering with the additional geometry complexity, but as always it's worth trying and seeing.


Generating this geometry can be relatively fiddly, but sprite texture atlases are normally static so this can be done offline as part of the application content authoring process, and does not need to be done live on the platform at run time.


Draw algorithm


With the two geometry sets for each sprite we can now render the optimized version of our test scene. First, render all of the opaque sprite regions and the background from front to back, with depth testing and depth writes enabled. This results in the output below:




Any area where one sprite or the background is hidden underneath another sprite is rendering work which has been saved, as that area has been removed by the early depth test before any shading occurred.


Having rendered all of the opaque geometry we can now render the transparent region for each sprite in a back-to-front order. Leave depth testing turned on, so that sprites on a lower layer don't overwrite an opaque region from a sprite in a logically higher layer which has already been rendered, but disable depth buffer writes to save a little bit of power.
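To see why the two-pass scheme saves work, here is a small self-contained C model using a 1D "screen" and a software depth buffer. The sprite shapes and counts are invented for illustration; it simply counts shaded fragments for the naive back-to-front blend versus the opaque-front-to-back then transparent-back-to-front scheme described above.

```c
#define SCREEN_W 16

/* 1D sprite model: covers pixels [start,end), of which [op_start,op_end)
 * is fully opaque. Depth follows GL_LESS: smaller = closer. */
typedef struct { int start, end, op_start, op_end; float depth; } Sprite;

/* Naive back-to-front blending: the background plus every covered pixel
 * of every sprite is shaded. */
int shaded_back_to_front(const Sprite *s, int n)
{
    int count = SCREEN_W;                 /* background shades everything */
    for (int i = 0; i < n; ++i)
        count += s[i].end - s[i].start;
    return count;
}

/* Two-pass scheme: opaque regions front-to-back with depth writes, then
 * transparent fringes back-to-front with depth test only. Sprites are
 * assumed pre-sorted with s[0] nearest the camera. */
int shaded_two_pass(const Sprite *s, int n)
{
    float depth[SCREEN_W];
    int count = 0;

    for (int x = 0; x < SCREEN_W; ++x)
        depth[x] = 1.0f;                           /* far plane */

    /* Pass 1: opaque sprite regions, then the background at depth 0.99. */
    for (int i = 0; i < n; ++i)
        for (int x = s[i].op_start; x < s[i].op_end; ++x)
            if (s[i].depth < depth[x]) { depth[x] = s[i].depth; ++count; }
    for (int x = 0; x < SCREEN_W; ++x)
        if (0.99f < depth[x]) { depth[x] = 0.99f; ++count; }

    /* Pass 2: transparent fringes, depth test on, depth writes off. */
    for (int i = n - 1; i >= 0; --i)
        for (int x = s[i].start; x < s[i].end; ++x) {
            int in_opaque = (x >= s[i].op_start && x < s[i].op_end);
            if (!in_opaque && s[i].depth < depth[x])
                ++count;                           /* fragment shaded */
        }
    return count;
}
```

With two overlapping sprites this toy scene shades 22 fragments instead of 32; the real saving depends entirely on how much overdraw your content has.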


If we clear the color output of the opaque stage, but keep its depth values, and then draw the transparent pass, we can visualize the additional rendering added by this pass. This is shown in the figure below:




Any area where one of the outer rings is partial indicates an area where work has been saved, as the missing part has been removed by the depth test using the depth value of an opaque sprite region closer to the camera which we rendered in the first drawing pass.


If we put it all together and render both passes to the same image then we arrive back at the same visual output as the original back-to-front render:




... but achieve that with around 35% fewer fragment threads started, which should translate approximately to a 35% drop in MHz required to render this scene. Success!


The final bit of operational logic needed is to ensure that the depth buffer we have added to the scene is not written back to memory. If your application is rendering directly to the EGL window surface then there is nothing to do here, as depth is automatically discarded for window surfaces, but if your engine is rendering to an off-screen FBO, ensure that you add a call to glInvalidateFramebuffer() (OpenGL ES 3.0 or newer) or glDiscardFramebufferEXT() (OpenGL ES 2.0) before changing the FBO binding away from the off-screen target. See Mali Performance 2: How to Correctly Handle Framebuffers for more details.




In this blog we have looked at how depth testing and depth-aware sprite techniques can be used to significantly accelerate 2D rendering on 3D graphics hardware.


Adding additional geometry to provide the partition between opaque and transparent regions of each sprite does add complexity, so care must be taken to minimize the number of vertices for each sprite, otherwise the costs of additional vertex processing and small triangle sizes will outweigh the benefits. For cases where the additional geometry required is too complicated, or the screen region covered is too small, simply omit the opaque geometry and render the whole sprite as transparent.


It’s worth noting that this technique can also be used when rendering the 2D UI elements in 3D games. Render the opaque parts of the UI at a depth very close to the near clip plane before rendering the 3D scene, then render the 3D scene as normal (any parts behind the opaque UI elements will be skipped), and finally render and blend the remaining transparent parts of the UI on top of the 3D output. To ensure that the 3D geometry does not intersect the UI elements, glDepthRange() can be used to slightly compress the range of depth values emitted by the 3D pass, guaranteeing that the UI elements are always closer to the near clip plane than the 3D rendering.
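As a rough model (simplified here to an input depth already in [0,1], whereas real OpenGL ES NDC depth is in [-1,1]), glDepthRange(n, f) linearly maps depth into the window range [n, f]. Reserving a small slice near 0 for the UI therefore guarantees the UI always wins the depth test against the 3D pass:

```c
/* Simplified model of glDepthRange(near_val, far_val): the incoming
 * depth (taken as already in [0,1] for brevity) is mapped linearly
 * into [near_val, far_val]. Drawing the 3D pass with, say,
 * glDepthRange(0.01, 1.0) and the opaque UI at a depth below 0.01
 * keeps the UI in front of all 3D geometry. */
float window_depth(float depth01, float near_val, float far_val)
{
    return near_val + depth01 * (far_val - near_val);
}
```

Even 3D geometry sitting exactly on the near clip plane ends up at window depth 0.01, behind nothing except the UI.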


Tune in next time,



[1] Other depth test functions are possible in OpenGL ES, but this is the common usage which is analogous to the natural real world behaviour.



Pete Harris is the lead engineer for the Mali GPU performance analysis team at ARM. He enjoys spending his time at a whiteboard determining how to get the best out of complex graphics sub-systems, and how to make ARM Mali GPUs even better.

This blog was written by Kapileshwar Syamasundar during his summer placement at ARM in the ARM Mali Graphics demo team. Kapil did some great work at ARM porting the Ice Cave demo to VR using Unity, we hope you can benefit from this too.



Ice Cave VR


1. Introduction

Ice Cave, the latest demo from ARM Mali Ecosystem, has been shown with great success this year in such major events as GDC, Unite Europe, and Unite Boston. The demo has been developed in Unity and aims to demonstrate that it is possible to render high visual quality content on current mobile devices. A number of highly optimized special effects were developed in-house, specifically for this demo, some of which are based on completely new techniques, for example the rendering of shadows and refractions based on local cubemaps.


Figure 1 View of the cave from the entrance in the Ice Cave demo.


The Ice Cave demo was released at a time when Virtual Reality has become the centre of attention in the game development community, and related events and media. A number of VR demos and games have already been released but VR performance requirements can limit the complexity of VR content and therefore the visual quality of the final VR experience.

It is in this landscape that the Ecosystem demo team decided to port the Ice Cave demo to Samsung Gear VR and this task was assigned to me. In this blog I describe my experience in porting the Ice Cave demo to VR during my eight weeks summer placement in the Ecosystem demo team.

By the time I joined the demo team, Unity had just released a version with VR native support for Oculus Rift and Samsung Gear VR.  Previously, VR support was only available by means of a plugin based on Oculus Mobile SDK, but this had some obvious limitations:

  • Each VR device has a different plugin
  • Plugins may conflict with each other
  • Release of newer VR SDKs / Runtimes can break older games
  • Lower level engine optimizations are not possible with plugin approach of two separate cameras

Conversely, the newly released Unity native VR integration lacked both support and sufficient information for developers, and had many unresolved issues. Nonetheless, the team was convinced that with the native integration in Unity we would be able to achieve the best possible performance; a key point in guaranteeing a successful VR user experience.


2. Samsung Gear VR

The Samsung Gear VR headset does not have a built-in display but has instead been designed to host a mobile phone. At the time of writing, the Samsung Gear VR comes in two versions: one for the Samsung Note 4 and another for the latest Samsung Galaxy S6. Some of the main specifications of the Samsung Galaxy S6 version are listed below.


  • Sensors: Accelerometer, Gyrometer, Geomagnetic, Proximity

  • Motion to Photon Latency < 20ms

  • Manual Focal Adjustment

  • Main Physical UI: Touch Pad

  • Oculus Asynchronous TimeWarp technology


Figure 2. The Samsung Gear VR for Samsung Galaxy S6.

Samsung Gear VR is powered by Oculus VR software and incorporates the Oculus Asynchronous TimeWarp technology. This important feature helps reduce latency (the time taken to update the display in response to the latest head movement), a key issue to avoid in VR devices. Besides the TimeWarp technology, the Samsung Gear VR has several sensors which it uses in place of the ones incorporated in the phone.

The Samsung Gear VR has its own hardware and features a touch pad, back button, volume key and, according to the specifications, an internal fan designed to help demist the device while in use.

The key point here however, is that you can insert your Samsung Galaxy S6 into the headset and enjoy an immersive experience with just a smartphone. We are no longer limited to the screen size of the phone and can instead become completely immersed in a virtual world.


3. Main steps to port an app/game to VR in Unity

VR integration in Unity has been achieved following one of the main Unity principles, that it must be simple and easy. The following basic steps are all that are needed to port a game to VR:

  • Use Unity 5.1 (or any later version) with native VR support.

  • Obtain the signature file for your device from the Oculus website and place it in the Plugins/Android/assets folder.

  • Set the “Virtual Reality Supported” option in Player Settings.

  • Set a parent for the camera. Any camera control must set position and orientation on the camera parent.

  • Associate the camera control with the Gear VR headset touch pad.

  • Build your application, deploy it on the device and launch it.

  • You will be prompted to insert the device into the headset. If the device is not ready for VR you will be prompted to connect to the network, where the device will download the Samsung VR software.

  • NB. It is useful to set the phone to developer mode in order to visualize the application running in stereo without inserting it into the Gear VR device. You can only enable developer mode if you have previously installed an appropriately signed VR application.


Enabling Gear VR developer mode

•       Go to your device Settings - Application Manager - Gear VR Service

•       Select "Manage storage"

•       Tap on the "VR Service Version" six times

•       Wait for scan process to complete and you should now see the Developer Mode toggle

Developer mode allows you to launch the application without the headset and also dock the headset at any time without having Home launch.

Figure 3. Steps to enable VR Developer mode on Samsung Galaxy S6.




Figure 4. Side-by-side view of stereo viewports captured with VR developer mode enabled.



4. Not as simple as it seems. Considering VR specifics.

After following the instructions above, I saw nothing but a black screen when I inserted the device into the headset. It took me some time to get the VR application running, and in the process I established that some existing features had to be changed and others added.

VR is a completely different user experience, and this is one of the key issues when porting to VR. The original demo had an animation mode which moved the camera through different parts of the cave to show the main features and effects. However, in VR this animation caused motion sickness in the majority of users, particularly when moving backwards. We therefore decided to remove this mode completely.

We also decided to remove the original UI. In the original Ice Cave demo a tap on the screen triggers a menu with different options but this was unsuitable for VR.  The original navigation system, based on two virtual joysticks, was also unsuitable for VR so we decided to entirely replace it with a very simple user interaction based on the touch pad:

  • Pressing and holding the touch pad moves the camera in the direction the user is looking.

  • Releasing the touch pad stops the camera.

  • A double tap resets the camera to its initial position.

This simple navigation system was deemed to be intuitive and easy by all users trying the VR version of the demo.
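A minimal C sketch of that navigation scheme is below. The function names and per-frame update structure are invented for illustration; the demo itself implemented this in Unity on the camera's parent object.

```c
typedef struct { float x, y, z; } Vec3;

/* Advance the camera parent along the current gaze direction while the
 * touch pad is held; releasing the pad stops the camera. 'gaze' is
 * assumed to be a unit vector from the head-tracking data, and 'dt'
 * the frame time in seconds. */
Vec3 step_camera(Vec3 pos, Vec3 gaze, float speed, float dt, int pad_held)
{
    if (pad_held) {
        pos.x += gaze.x * speed * dt;
        pos.y += gaze.y * speed * dt;
        pos.z += gaze.z * speed * dt;
    }
    return pos;
}
```

The 'speed' parameter is the value the team tuned by testing, since too high a value caused motion sickness.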


Figure 5. User interaction with the touch pad on the Samsung Gear VR.


The camera speed was also a feature we considered carefully as many users experienced motion sickness when the camera moved just a little too fast. After some tests we were able to set a value that most people were comfortable with.

Additionally, the camera has to be set as a child of a game object. This is the only way Unity can automatically integrate the head tracking with the camera orientation. If the camera has no parent this link will fail so any translation and rotation of the camera has to be applied to the camera parent node.

In VR, as in reality, it is important to avoid tight spaces so the user doesn’t feel claustrophobic. The original Ice Cave was built with this in mind and provides ample space for the user.

The only effect not imported to VR was the dirty lens effect. In the original Ice Cave demo this effect is implemented as a quad that is rendered on top of the scene. A dirty texture appears with more or less intensity depending on how much the camera is aligned with the sun. This didn’t translate well to VR and so the decision was made to completely remove it from the VR version.


Figure 6. Dirty lens effect implemented in the original Ice Cave demo.


5. Extra features in the Ice Cave VR version

In the original demo the user can pass through the walls to look at the cave from the outside. However, in VR this didn’t create a good experience and the sense of immersion disappeared when you went out of the cave. Instead, I implemented camera collision detection and smooth sliding for when the user moves very close to the walls.

When running a VR application on Samsung Gear VR, people around the user are naturally curious about what the user is actually seeing. We thought that it would be interesting, particularly for events, to stream the content from the VR headset to another device such as a tablet. We decided to explore the possibility of streaming just the camera position and orientation to a second device running a non-VR version of the same application.

The new Unity networking API allowed rapid prototyping, and in a few days I had an implementation which worked well. The device running the VR version on the Samsung Gear VR acts as a server and, each frame, sends the camera position and orientation over wireless TCP to a second device acting as a client.
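A per-frame message like this can be as small as 28 bytes: three floats of position plus a four-float orientation quaternion. The C layout below is a hypothetical sketch of such a wire format; the demo itself used Unity's networking API rather than hand-rolled packets.

```c
#include <string.h>
#include <stdint.h>

/* Camera pose streamed once per frame: position + orientation
 * quaternion. A fixed raw-float layout is assumed here purely for
 * illustration (no endianness handling). */
typedef struct { float pos[3]; float quat[4]; } CameraPose;

size_t pack_pose(const CameraPose *p, uint8_t *buf)
{
    memcpy(buf, p->pos, sizeof p->pos);
    memcpy(buf + sizeof p->pos, p->quat, sizeof p->quat);
    return sizeof p->pos + sizeof p->quat;     /* 12 + 16 = 28 bytes */
}

void unpack_pose(const uint8_t *buf, CameraPose *p)
{
    memcpy(p->pos, buf, sizeof p->pos);
    memcpy(p->quat, buf + sizeof p->pos, sizeof p->quat);
}
```

At 28 bytes per frame even a 60 fps stream is under 2 KB/s, which is why streaming only the pose, rather than the rendered frames, works so well over Wi-Fi.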


Figure 7. Streaming camera position and orientation from Samsung Gear VR to a second device.


Using the built-in touch pad to control the camera motion proved very successful. Nevertheless, we decided to provide the user with an alternative method of control using a readily available external Bluetooth mini controller. This required us to write a plugin to extend Unity's functionality by intercepting the Android Bluetooth events and using them to trigger movement and resetting of the camera. Unfortunately there is not much information available, so it was only possible to intercept the messages coming from two keys, but this was enough to move/stop and reset the camera.




6. Conclusions

Ice Cave VR was implemented during my summer placement with ARM’s Ecosystem Demo team in less than eight weeks with no previous experience of Unity. This was possible thanks to the native VR integration Unity released on version 5.1. In principle, just a few steps are necessary to port a game to VR, although in practice you need to do some extra work to fine-tune the specific requirements of VR in your game. With this integration, Unity has greatly contributed to the democratisation of VR.

Unity VR integration is still in progress and some reported issues are expected to be solved in coming versions. Nonetheless, the Ice Cave VR version shows that it is possible to run high quality VR content on mobile devices if resources are balanced properly at runtime by using highly optimized rendering techniques.

All the advanced graphics techniques utilised in the Ice Cave demo are explained in detail in the ARM Guide for Unity Developers. In the guide it is possible to find the source code or code snippets of these techniques which allowed me to understand how they work.

For me, the most significant point in all this is that with mobile VR we are no longer limited to the size of our smartphone screens to enjoy a game. Now we can be part of a limitless virtual world and enjoy a wonderful VR experience from a tiny smartphone inserted in a headset. This really is an outstanding step forward!


Now you can check out the video



Be part of the future of mobile game technology. Bring along your project challenges for expert advice on how to maximise performance of your game for mobile platforms. Learn about the latest ARM CPU and Mali GPU architecture, multicore programming and tile-based mobile GPUs and see how to implement highly optimized rendering effects for mobile.


The day will feature:

  • VR showcase area with live demos
  • Q&A Clinics and the sharing of best coding practice for mobile platforms
  • Talks on mobile VR and Geomerics’ advanced applications of dynamic global illumination, plus a panel discussion on “the future of mobile game technology”
  • Open discussions with ARM experts about your own project challenges and suggested improvements
  • The opportunity to network with top mobile game developers and engineers
  • Free hot lunch and refreshments throughout the day


London - ARM Game Developer Day


Thursday 3rd December 2015

9:00am - 6:00pm

Rich Mix, 35-47 Bethnal Green Rd, London E1 6LA


See full agenda and register here to make sure you don’t miss out!



Chengdu - ARM Game Developer Day


Wednesday 16th December 2015

9:00am - 6:00pm

iTOWN Coffee, Building A, Chengdu Tianfu Software Park, China


See full agenda and register here too!

Vertex interleaving


Recently we were asked via the community whether there was an advantage in either interleaving or not interleaving vertex attributes in buffers. For the uninitiated, vertex interleaving is a way of mixing all the vertex attributes into a single buffer. So if you had 3 attributes (let’s call them Position (vec4), Normal (vec4), and TextureCoord (vec2)) uploaded separately they would look like this:


P1xP1yP1zP1w , P2xP2yP2zP2w , P3xP3yP3zP3w ... and so on

N1xN1yN1zN1w , N2xN2yN2zN2w , N3xN3yN3zN3w ... and so on

T1xT1y , T2xT2y , T3xT3y ... and so on


(In this case the commas denote a single vertex worth of data)

The interleaved buffer would look like this:


P1xP1yP1zP1w N1xN1yN1zN1w T1xT1y , P2xP2yP2zP2w N2xN2yN2zN2w T2xT2y ,

P3xP3yP3zP3w N3xN3yN3zN3w T3xT3y ... and so on




… Such that the individual attributes are mixed, with a given block containing all the information for a single vertex. This technique is what the stride argument in the glVertexAttribPointer function and its variants is for, allowing the application to tell the hardware how many bytes it has to jump forwards to get to the same element in the next vertex.
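In C, the interleaved layout above is naturally a struct, and the stride and offset arguments fall out of sizeof and offsetof. The attribute locations in the commented GL calls are assumptions of this sketch.

```c
#include <stddef.h>

/* One interleaved vertex matching the Position/Normal/TextureCoord
 * example above. With this layout, the 'stride' argument of
 * glVertexAttribPointer is sizeof(Vertex), and each attribute's
 * pointer argument is its byte offset within the struct. */
typedef struct {
    float position[4];   /* P: vec4, offset 0  */
    float normal[4];     /* N: vec4, offset 16 */
    float texcoord[2];   /* T: vec2, offset 32 */
} Vertex;

/* Typical setup (sketch; attribute locations 0/1/2 are assumed):
 * glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, sizeof(Vertex),
 *                       (void *)offsetof(Vertex, position));
 * glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, sizeof(Vertex),
 *                       (void *)offsetof(Vertex, normal));
 * glVertexAttribPointer(2, 2, GL_FLOAT, GL_FALSE, sizeof(Vertex),
 *                       (void *)offsetof(Vertex, texcoord));
 */
```

Using offsetof rather than hand-counted byte offsets keeps the GL setup correct if the struct layout ever changes.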


However, even though we all knew about interleaving, none of us could really say whether it was any better or worse than just putting each attribute in a different buffer, because (to put it bluntly) separate buffers are just easier to implement.


So, in a twist to usual proceedings, I have conferred with arguably the top expert on Mali efficiency, Peter Harris. What follows is my interpretation of the arcane runes he laid out before my quivering neurons:


Interleaving is better for cache efficiency…


… Sometimes.


Why does interleaving work at all?


The general idea behind interleaving is related to cache efficiency. Whenever data is pulled from main memory it is loaded as part of a cache line. This single segment of memory will almost certainly contain more than just the information desired, as one cache line is larger than any single data type in a shader program. Once in the local cache the data in the loaded line is more quickly available for subsequent memory reads. If this cache line only contains one piece of required information, then the next data you need is in a different cache line which will have to be brought in from main memory. If however, the next piece of data needed is in the same cache line, the code can fetch directly from the cache and so performs fewer loads from main memory and therefore executes faster.


Without getting into physical memory sizes and individual components, this can be illustrated thusly:


Imagine we have 3 attributes, each of them vec4s. Individually they look like this:


| P1 P2 | P3 P4 | ...

| N1 N2 | N3 N4 | ...

| T1 T2 | T3 T4 | ...


From this point forward the vertical lines represent the boundaries between cache lines. For the sake of argument, the cache lines in this example are 8 elements long, so each contains two vec4s; in the real world our cache lines are 64 bytes, large enough to hold four 32-bit precision vec4 attributes. To keep the illustration clear I’ll be keeping the data small in these examples, so if we want all the data for vertex number 2 we would load 3 cache lines from the non-interleaved data:


P1 P2

N1 N2

T1 T2


If this data is interleaved like so:


| P1 N1 | T1 P2 | N2 T2 | P3 N3 | T3 P4 | N4 T4 | ...


The cache lines fetched from main memory will contain:


T1 P2

N2 T2


(We start from T1 because of the cache line alignment)


Using interleaving we've performed one less cache line fetch. In terms of wasted bandwidth, the non-interleaved case loaded 3 attributes which went unused, but only one unused attribute was fetched in the interleaved case. Additionally, it's quite possible that the T1 P2 cache line wouldn't need to be specifically fetched while processing vertex 2 at all; if the previously processed vertex was vertex 1, it is likely that the data will still be in the cache when we process vertex 2.


Beware misalignment


Cache efficiency can be reduced if the variables cross cache line boundaries. Notice that in this very simple example I said the texture coordinates were vec4s. Ordinarily texture coordinates would be held in vec2 format, as shown in the very first explanation of interleaving. In that case, visualising the individual elements of the buffer, the cache boundaries would divide the data in a very nasty way:


PxPyPzPw NxNyNzNw | TxTy PxPyPzPw NxNy | NzNw TxTy PxPyPzPw | …


Notice that our second vertex's normal is split, with the x,y and z,w in different cache lines. Though two cache lines will still contain all the required data, this should be avoided, as there is a small additional power overhead in reconstructing the attribute from two cache lines. If possible, avoid splitting a single vector over two cache lines (spanning a 64-byte cache boundary); this can usually be achieved by suitable arrangement of attributes in the packed buffer. In some cases adding padding data may help alignment, but padding itself creates some inefficiency, as it introduces redundant data into the cache which isn’t actually useful. If in doubt, try it and measure the impact.


But it's not always this simple


If we look at the function of the GPU naïvely, all of the above makes sense; however, the GPU is a little cleverer than that. Not all attributes need to be loaded by the vertex processor. The average vertex shader looks something like this:


     uniform vec4 lightSource;

     uniform mat4 modelTransform;

     uniform mat4 cameraTransform;


     in vec4 position;

     in vec4 normal;

     in vec2 textureCoord;

     in vec2 lightMapCoord;


     out float diffuse;

     out vec2 texCo;

     out vec2 liCo;


     void main( void ){

          texCo = textureCoord;

          liCo = lightMapCoord;

          diffuse = dot((modelTransform*normal),lightSource);

          gl_Position = cameraTransform*modelTransform*position;

     }

If you look at our outputs, diffuse is calculated at this stage, as is gl_Position, but texCo and liCo are just read from the input and passed straight back out without any computation being performed. For a deferred rendering architecture this is really a waste of bandwidth, as it doesn’t add any value to the data being touched. In Midgard family GPUs (Mali-T600 or higher) the driver understands this (very common) use case and has a special pathway for it: rather than being loaded into the vertex processor and written out again to be interpolated, attributes of this type are never really seen by the vertex processor. They bypass the vertex shader completely and are passed directly to the fragment shader for interpolation.


Here I've used a second set of texture coordinates to make the cache align nicely for this example. If we fully interleave all of the attributes, our cache structure looks like this:


PxPyPzPw NxNyNzNw | TxTy LxLy PxPyPzPw | NxNyNzNw TxTy LxLy | ...


Here the vertex processor still needs to load in two attributes P and N, for which the cache line loads will either look like:


PxPyPzPw NxNyNzNw


… or …


TxTy LxLy PxPyPzPw | NxNyNzNw TxTy LxLy


… to obtain the required data, depending on which vertex we are loading. In the latter case the T and L components are never used, and will be loaded again separately to feed the interpolator during fragment shading. It’s best to avoid both the redundant bandwidth of the T and L loads during vertex shading and the redundant P and N loads during fragment shading. To do this we can split the data into two separate interleaved buffers: one containing all of the attributes needed for computation in the vertex shader:


PxPyPzPw NxNyNzNw | PxPyPzPw NxNyNzNw | PxPyPzPw NxNyNzNw | ...


… and one containing all of the attributes which are just passed directly to interpolation:


TxTy LxLy TxTy LxLy | TxTy LxLy TxTy LxLy | TxTy LxLy TxTy LxLy | ...


This means that the vertex shader will only ever need to touch the cache lines of the first buffer (the P and N data), and the fragment interpolator will only ever have to touch those of the second (the T and L data, as well as any other interpolated outputs from the vertex shader). This gives us a much more efficient bandwidth profile for the geometry processing. In this particular case it also means perfect cache alignment for our vertex processor.


A note on data locality


Caches function best when programs make use of the data in the same cache lines within a small time window. This maximizes the chance that data we have fetched is still in the cache and avoids a refetch from main memory. Cache lines often contain data from multiple vertices, which may come from multiple triangles. It is therefore best practice to make sure that vertices which are adjacent in memory are also nearby in the 3D model (both in terms of attribute buffers and index buffers). This is called data locality, and you normally need to look no further than your draw call's indices (if you are not using indexed models you have far bigger problems than cache efficiency to solve). If the indices look like this:


(1, 2, 3) (2, 3, 4) (3, 4, 5) (4, 5, 2) (1, 3, 5) ...


You have good data locality. On the other hand, if they look like this:


(1, 45, 183) (97, 12, 56) (4, 342, 71) (18, 85, 22) ...


… then they're all over the place and you’ll be making your GPU caches work overtime. Most modelling software will have some kind of plugin to better condition vertex ordering, so talk to your technical artists to get that sorted somewhere in the asset production process.


To maximize the cache efficiency it’s also worth reviewing the efficiency of your vertex shader variable types, both in terms of sizes and number of elements. We see a surprising amount of content which declares vector elements and then leaves many channels unused (but allocated in memory and so using valuable cache space); or which uploads highp fp32 data and then uses it in the shader as a mediump fp16 value. Removing unused vector elements and converting to narrower data types (provided the OES_vertex_half_float extension is available) is a simple and effective way to maximize cache efficiency, reduce bandwidth, and improve geometry processing performance.


So there you have it: interleaving vertex attributes. It would be remiss of me to tell you to expect immediate, vast performance improvements from this technique. At best it will only claw back a little efficiency, but in large, complex projects where you need to squeeze as much as possible out of the hardware, these tiny improvements can all add up.


Thanks again to Peter Harris, who provided a lot of the information for this blog and also was kind enough to go through it afterwards and take out all my mistakes.

This blog post refers to the public ARM Mali Midgard r6p0 user-space binary drivers for GNU/Linux which are now available for download.  The "fbdev" variant now has support for dma-buf using a standard ioctl, which is explained in detail here.


What is wrong with fbdev?


The Linux kernel defines a user-space API that lets applications control displays via frame buffer drivers, also known as "fbdev" drivers.  This is achieved via a set of file operations on "/dev/fb*" devices such as ioctl or mmap, which allows direct access to the pixel data of a display.  While this is a simple, widespread, lightweight and powerful interface, it is no longer suitable in the world of modern embedded graphics computing.


For example, it does not provide any means to share the frame buffer directly with another kernel driver such as a GPU driver.  When running an application that renders graphics with a GPU, each frame contains pixels that the GPU generated within its own buffers.  The display controller is a separate entity with its own frame buffer located elsewhere in memory.  So the standard way of displaying the GPU frames in fbdev mode is for the user-space (EGL part of the driver) to copy the data from a GPU buffer to the display controller's frame buffer.  This is done on the CPU, typically by calling memcpy.  This works, but can easily become the biggest bottleneck in the system and is therefore not acceptable in a real product.


There are many other limitations to "fbdev" which are not covered in this blog post, such as managing layers, synchronising applications with the display controller or performing 2D composition in hardware.


What can we do to fix it?


All new display controller drivers should be implementing the Direct Rendering Manager (DRM) API instead of fbdev.  DRM offers a much more modern set of features and solves many of the problems seen with fbdev.  However, there are still a lot of GPU-accelerated products that control their display using an fbdev Linux driver.  Work needs to be done on each of these to avoid the unacceptably inefficient CPU memcpy and to let the GPU directly write into the frame buffer.


A typical way of achieving this is by using the dma-buf framework in the Linux kernel.  Typically, the frame buffer driver registers itself as a dma-buf exporter and implements a way for user-space to get a file descriptor for this frame buffer (an ioctl).  The user-space driver then passes the dma-buf file descriptor to the GPU kernel driver to import and use.  When rendering happens, the GPU pixels are directly written into the frame buffer using a hardware DMA operation - this is also referred to as "zero-copy" and is much faster than calling memcpy.  However, there is no standard ioctl in the fbdev API to export the frame buffer with dma-buf so each fbdev driver has a slightly different one.  This means the user-space needs to be modified to work with each fbdev driver, which is not compatible with the public standard Mali binary drivers as they must not depend on platform specific display drivers.




This is why, starting with r6p0, we are adding a more generic ioctl to export the dma-buf file descriptor, defined as a custom extension to the standard fbdev API in supported kernels: the FBIOGET_DMABUF ioctl.  This way, no extra dependency is added to the user-space binary.  If the ioctl is not available on a given kernel, the user-space falls back to memcpy.  We have already enabled this in our ODROID-XU3 Linux kernel branch on GitHub.


We intend to keep this added functionality in all new fbdev Mali Midgard binary drivers and Linux kernels for supported platforms, such as the ODROID-XU3, Chromebook devices with an ARM Mali GPU and Firefly.

Yesterday, the Media Processing Group at ARM announced a new highly-efficient graphics processing unit (GPU), the ARM® Mali™-470, to enable smartphone-quality visuals on wearable and IoT devices:


ARM Mali-470 GPU Offers Improved Efficiency and Experiences on Wearable and IoT Devices


A growing market with unique challenges


The wearables market has been growing steadily for many years with more and more devices and applications entering the market. ARM has long been associated with wearables, with many devices based on ARM technologies and more recently with the “Wearables for good” challenge in partnership with UNICEF and frog. That association now extends to graphics processing with the Mali-470.


Mali-470 is the latest in the Mali-400 series of graphics processors that run applications using the ubiquitous OpenGL® ES 2.0 graphics standard. The Mali-400 family has shipped in more than a billion devices worldwide and is favoured where efficient graphics processing is a must. One example is the growing number of System-on-Chips (SoCs) designed specifically for wearable and IoT applications, such as MediaTek’s MT2601 SoC, announced earlier this year in support of Google’s Android Wear software:


MediaTek Introduces MT2601 in Support of Google’s Android Wear Software


The key advantage of Mali-470 is that it consumes half the power of the Mali-400 GPU, helping device manufacturers bring the smartphone user experience to environments with even greater power-constraints.


Expanding the smartphone user experience



We’ve all become accustomed to high-quality visuals, backed by a touchscreen, as the most intuitive way of interacting with our smartphones and tablets. When we use other types of device, we want to interact with them in a very similar way.


For those of us who remember Video Cassette Recorders, and the frustration of trying to program the timer for the first time, it’s hard to imagine anyone tolerating that kind of user experience ever again. Yet, across many devices, the user interface quality has fallen far behind that of our smartphone.


From watches to thermostats, industrial control panels in factories and warehouses, multi-function printers in offices, infotainment systems in cars and home appliances, highly efficient graphics processing is essential to render intuitive user interfaces.


The challenge many of these devices face is power consumption and how to reduce it as the interface becomes more sophisticated – we think Mali-470 is the answer.


Why OpenGL ES 2.0?


Every pixel matters in delivering high-quality user interfaces. This is especially true for smaller screens where every pixel must play a role in conveying information clearly or providing intuitive controls or both.


The majority of Android™, Android Wear and other emerging operating systems, such as Tizen™, use OpenGL ES 2.0 for modern user interfaces, mapping, casual gaming, etc. OpenGL ES 2.0 offers the ideal balance between per-pixel control with programmable shaders and energy-efficiency. Mali-470 uses the same industry-standard OpenGL ES 2.0 driver stack as the Mali-400 GPU so there is no need to re-optimise existing applications – anything written for Mali-400 will work seamlessly on the Mali-470 GPU.


More recent versions of OpenGL ES have introduced a number of additional features to support immersive video games; however the OpenGL ES 2.0 feature level is the most efficient for user interfaces that appear on wearable and IoT devices.



Half the power consumption


Building on the success of the Mali-400 GPU, Mali-470 delivers the same rich performance at the same process geometry while halving the power consumption. This provides SoC manufacturers with scalable options to enable them to create embedded graphics subsystems that meet the needs of new low-power devices.




Mali-470 achieves this by building on the energy-efficiency gained in Mali-450 and applying focussed design changes to the Vertex and Fragment Processors. This results in half the power consumption with the same performance when compared to the Mali-400. Vertex processors construct the “wire frame” of a scene and the fragment processors perform the per-pixel shading, colours and effects such as transparency. For wearable device resolutions a single fragment processor is sufficient, but Mali-470 has the ability to scale to four fragment processors to support the higher resolutions of devices with larger screens.


Mali-470 block diagram: Up to 4 pixel processors can be implemented and this multi-core

design supports screen resolutions from 640x640 to 1080p at 60FPS 32bpp


The design improvements in Mali-470 can be grouped into three areas of equal importance: Quad-thread scheduling, Microarchitectural and Datapath optimisations.


Quad-thread scheduling optimisations:

  • Enforcing the grouping of quads (2x2 pixel threads) so that the frequency of control and state updates within the pipelines is significantly reduced.
  • Optimising many of the functional blocks to operate on quads.
  • Centralising a subset of per-quad state and accessing it only when necessary, rather than clocking it through the pipelines

Microarchitectural optimisations:

  • Making aggressive use of clock-gating throughout the design, including clock-gating of all function-orientated L1 caches
  • Bypassing functional blocks whenever instruction execution can proceed without them

Datapath optimisations:

  • Optimising datapaths to make targeted use of fixed-point arithmetic, rather than floating-point arithmetic for vertex processing


Wearables and beyond…


Designed for wearables and IoT devices, the Mali-470 GPU will benefit a multitude of devices that require a rich UI and where energy-efficiency is important, especially when coupled with ARM CPUs such as the Cortex®-A7 and A53 processors.




To summarise, the Mali-470 graphics processor further expands the smartphone experience into a wider range of devices including wearables, home gateways and appliances, industrial control panels, healthcare monitors and even new entry-level smartphones.


With half the power consumption of the billion-selling Mali-400 GPU, Mali-470 opens the door for more vibrant user interfaces and provides exciting opportunities for designers to innovate with graphics in even more power-constrained environments. We expect to see Mali-470 appearing in its first devices from early 2017.

A report from AnandTech about ARM's new Mali-470 GPU:

ARM Announces Mali-470 GPU: Low Power For Wearables & More

Hello all,


My name is Dale Whinham, and I’m an intern within the Media Processing Group at ARM. I have been working with ARM over the summer to produce some additional sample code for the Mali SDK, which is freely downloadable for your platform of choice over at our SDKs section here:


In this blog post, I wanted to talk a little bit about my experiences with the Mali OpenGL ES Emulator, which saw some significant updates recently, as detailed by lorenzodalcol in this previous blog post:


I am very new to OpenGL (ES) development, so currently I rely fairly heavily on good debugging tools to help me understand where I might be going wrong. As a newcomer to graphics development, I learned fairly quickly that staring at a black screen for several hours is normal and okay when you are just starting out, as there are quite a lot of things you need to do to prepare your OpenGL context for rendering, such as allocating vertex buffer objects, filling them with data, compiling shaders, getting handles to their attributes and uniforms, and so on. This in itself can be quite overwhelming at first, especially when it’s difficult to see what’s going on inside OpenGL, because you’re cross-compiling native code for an Android device, and debugging options are more limited.


One of the challenges I was faced with was that I struggled to get debugging to work properly for native code in Eclipse with Android Development Tools (ADT).


So, a bit of context: at the time of writing, Google have now deprecated support for Eclipse with ADT, in favour of their new Android Studio IDE – which is great news for Android Java developers, as the IntelliJ platform makes for a very rich and stable IDE, but not-so-great news for developers writing C/C++ code, as Android Studio’s native code support is still in its early stages at the time of writing. Until the tools mature, Eclipse with ADT is still relied on by many developers to write and build native code for Android.


As such, I just couldn’t get the Eclipse debugger to co-operate and set breakpoints on my native Android code. I found myself spending more time on StackOverflow trying to find solutions to problems with Eclipse than I did actually writing the code, and so I started looking for another strategy!


I was made aware of the Mali OpenGL ES Emulator, which is a comprehensive wrapper library for OpenGL that allows you to write code for the OpenGL ES API, but have it run on a desktop computer with desktop OpenGL. This would allow me to work on my project on the desktop, get it working the way I wanted, and then move it back into Eclipse and rebuild for Android later. The Mali Linux SDK actually comes with Microsoft Visual Studio project files, and you can build and run the samples for Windows if you have the Mali OpenGL ES Emulator installed. I decided to migrate my project to Visual Studio 2015 so that I could design and debug it on the desktop more easily, though I could have also chosen to use Linux-based tools, as the Mali OpenGL ES Emulator provides a Linux version too.


The installation procedure is quite straightforward. There are two flavours of the Mali OpenGL ES Emulator to download – 32bit or 64bit – and you’ll need to install the version corresponding to your build target, i.e. whether you’re compiling for 32bit or 64bit. You can, of course, install both if you’re building for both architectures, but beware of mixing up the "bitness" – if your app is compiled for 64bit but tries to load the 32bit emulator DLLs, it may crash.


Once installed, configure your project to search for headers within the “include” directory inside the Mali OpenGL ES Emulator’s installation folder – e.g. for the 64bit version on Windows, for me it was C:\Program Files\ARM\Mali Developer Tools\Mali OpenGL ES Emulator 2.2.1\include (see Figure 1). This folder contains header files for EGL and OpenGL ES 2 and 3 as well as their extensions.

Figure 1: Setting additional include directories in Visual Studio. Note the semicolon is used to add multiple directories.


Additionally, configure your project to add the installation folder to your list of linker search directories, so it can link against the wrapper libraries (see Figure 2):


Figure 2: Setting additional library directories in Visual Studio.

Once you’ve done this, you’re pretty much ready to go. On Windows, the Mali OpenGL ES Emulator installer sets your system’s PATH environment variables so that your compiled application will find the OpenGL ES libraries correctly at runtime. You can now begin writing code as if it were for a mobile GPU by including the OpenGL ES headers in your source code, and calling OpenGL ES functions as normal.

Figure 3 shows a screenshot of the Mali OpenGL ES emulator in action, showing a simple 3D scene from one of my work-in-progress code samples. The code sample has some glue code to give me a Windows GUI window, but the rendering context and draw calls are all EGL and OpenGL ES – wrapped seamlessly to desktop OpenGL by the Mali OpenGL ES Emulator:


Figure 3: A simple 3D scene being rendered in OpenGL ES using the Mali OpenGL ES Emulator

In addition to being able to use the powerful Visual Studio debugger for my C++ code, a major benefit of the OpenGL ES Emulator is that I can stack desktop OpenGL debuggers on top of it.

For instance, what if I wanted to check the geometry of my 3D models with a wireframe view? Well, in desktop OpenGL I could just use glPolygonMode() with GL_LINE as the mode parameter, but in OpenGL ES we don’t have this function and so we would have to write a shader.

Alternatively I could use the force-wireframe feature of an OpenGL debugger. Enter GLIntercept, a powerful open-source OpenGL debugger for Windows that comes with a multitude of features, including (but not limited to) run-time shader editing, the ability to freely move the camera, texture/framebuffer dumping, and wireframe rendering. By placing its special OpenGL32.dll into the same directory as our application executable, along with a configuration file that lets us pick the debugging features we’d like to enable, it intercepts all calls to OpenGL, allowing us to tweak the behaviour of OpenGL before it gets forwarded to the GPU driver.

In Figure 4, we can see that same scene again, but with GLIntercept enabled, forcing wireframe on, and allowing me to see the geometry of my 3D objects without having to change the code of my project:


Figure 4: The same 3D scene using the wireframe debugging feature of GLIntercept

This is just the tip of the iceberg of what is possible with the Mali OpenGL ES emulator. It supports many OpenGL extensions such as ASTC texture compression, and extensions from the Android Extension Pack – a collection of extensions found in Android 5.0+ that gives you many advanced rendering capabilities, making it a powerful addition to your development tools. With a reasonably-specced graphics card in your PC, you can save a lot of time developing applications that use these features by eliminating the process of loading your code onto a development device from your workflow – at least in the earlier stages of development, when recompiling and testing may be quite frequent.

For more information about the Mali OpenGL ES Emulator, check out our product page over here:


Make sure you grab the PDF User Guide from the download link too, for a comprehensive manual that gives you all the technical details about the emulator, including system requirements for the host PC, supported extensions, texture formats and much more!

Achieving the icy wall effect in the Ice Cave demo

Ice Cave is ARM’s latest demo. It shows that great graphical quality can be obtained in mobile devices with the use of conventional highly optimised rendering techniques. The demo is visually stunning and full of life: there are reflections on the board, refractions within the phoenix statue and there is a feeling of time passing as the light changes with the sun moving, coupled with the patrolling tiger and the fluttering butterfly. All these elements are immediately evident, but there are a few others that are more subtle but add another layer of dynamism to the demo, such as the reflective icy walls of the cave. The purpose of this blog is to explain how we achieved this effect within the Ice Cave in a way that is easy to understand for everyone, in order for developers to be able to replicate the technique themselves, as this is a very performance-efficient technique that works wells in mobile.


Fig. 1: The Ice Cave [the effect can be seen in the video at 3:00]

Ice is a tricky material to replicate due to its reflective and refractive properties. The way it refracts light gives it a particular hue of blue that is a bit difficult to pinpoint and texture effectively without the asset appearing excessively blue or completely washed out. Light scatters off ice in different ways depending on the surface of the ice itself, which means the reflections on the surface can be anything from relatively clean to completely distorted and unrecognisable. In an environment such as the Ice Cave, one would expect uneven, relatively unclean reflections on the walls due to their irregular nature, and if you look closely when panning around the demo, you can see that they are there. This effect is the result of a long effort of investigating how to achieve a parallax effect that made the ice appear thick and reflective.

I originally wanted a parallax effect on the walls, but previous attempts had failed to produce the kind of effect we were after. The idea for the technique currently found in the Ice Cave originated from an accidental switch between the world space normals and the tangent space normals. I noticed that the switch resulted in a strange, distorted effect on the walls that was highly reflective. These were the sort of reflections we wanted, but we needed to have control over them. Using that thought as inspiration, I started looking into how to localise the reflective effect to only certain areas of the cave walls in order to get the parallax effect we were after.



Fig. 2: Close-up of the reflective icy walls [the effect can be seen in the video at 3:16]

The initial map used to produce the reflections was an object space normal map with sections that were greyed out, and as such contained no normal information (Fig. 3). Even though it worked, it was a complicated process to tweak the normal map information as it had to be done in Photoshop by manually adding and removing sections of the texture as we needed. That was when I had the first thought of using two separate maps in order to interpolate between them to obtain the reflections.



Fig. 3: The first modified normal map

The two main elements of the polished technique are the object space normal maps and the greyscale normal maps (Fig. 4). The white areas of the grey map remain unaffected and as such take on the information provided by the object space normal maps. It is the combination of the two which produces the icy, parallax-like reflections on the walls of the cave.




Fig. 4: Object space normals on the left and final greyscale normals on the right

The greyscale normals are made by removing the colour from the tangent space normal maps (Fig. 5). This produces a picture with little tonal variation in the grey, such that most of the values are in the range of 0.3 - 0.8.




Fig. 5: Tangent space normals and resulting greyscale normals

It is important to use the tangent space normal maps to produce the greyscale map, as their colour variation is minimal, which means that once the colour is removed you are left with a map that very clearly shows all the rough detail of the surface. On the other hand, if you use the object space normal maps you will get an image that shows where the light hits as well as the rough detail, due to the contrasting colours (Fig. 6).


modified normal map Grey bk Greyscale paired.png


  1. Fig. 6: Object space normals to the left and the resulting greyscale normals to the right,
    on which the areas where the light hits are very evident

The grey normals should only cause reflection on the walls of the cave, not on the snow. Therefore the diffuse map and the greyscale normal map have to match: wherever there is white in the diffuse map the grey normal map is transparent, and wherever there is black in the diffuse map the grey normal map is opaque (Fig. 7).


modified normal map Diffuse texture paired.png


  1. Fig. 7: Diffuse texture on the left and final greyscale normal map on the right. The transparent areas on the normal map match the black ice areas on the diffuse texture.

The grey normals are then combined with the true normals using a value proportional to the transparency value of the greyscale normals:

half4 bumpNormalGrey = lerp(bumpNorm, bumpGrey, amountGreyNormalMap);

As a result, in the dark, rocky parts of the cave, the main contribution will come from the greyscale normals and in the snowy part from the object space normals, which produces the effect we are looking for.
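A minimal sketch of this blend, written in Python for illustration (the normal values here are invented for the example):

```python
def lerp(a, b, t):
    # Per-component linear interpolation, as the shader's lerp() does.
    return tuple(x + (y - x) * t for x, y in zip(a, b))

bump_norm = (0.1, 0.9, 0.2)   # object space normal (snowy areas)
bump_grey = (0.6, 0.6, 0.6)   # greyscale normal (dark, rocky areas)

# The blend weight follows the greyscale map's transparency:
print(lerp(bump_norm, bump_grey, 0.0))  # snow: pure object space normal
print(lerp(bump_norm, bump_grey, 1.0))  # rock: pure greyscale normal
```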

At this point all the normals have equal components with values between 0.3 and 0.8. This means the normals point along the bisector of the first octant, as each normal's components are equal: (0.3, 0.3, 0.3), … , (0.8, 0.8, 0.8).

The shader then applies the transformation normally used to map values from the interval [0, 1] to the interval [-1, 1]: 2 * value – 1. After this transformation, part of the resulting normals point along the bisector of the first octant and the rest point in the opposite direction.

If the original normal has the components (0.3, 0.3, 0.3), the resulting normal is (-0.4, -0.4, -0.4); if the original normal has the components (0.8, 0.8, 0.8), the resulting normal is (0.6, 0.6, 0.6). The normals therefore point in two main, opposite directions. Additionally, when the reflection vector is calculated, the built-in reflect function is used. This function expects the normal vector to be normalized, but what we pass is a non-normalized normal.
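The remapping and the sign flip around 0.5 can be checked in a couple of lines (Python, for illustration):

```python
def remap(v):
    # The standard [0, 1] -> [-1, 1] transformation applied by the shader.
    return 2.0 * v - 1.0

print(remap(0.3))  # -0.4: points opposite the first-octant bisector
print(remap(0.5))  #  0.0: the switching point
print(remap(0.8))  #  0.6: points along the first-octant bisector
```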

As the normals are not normalized the length is less than 1. The reflect function is defined as:

R = reflect(I, N) = I – 2 * dot(I, N) * N

When you use the built in reflect function with a non-normalized normal with a length less than 1 the resulting reflection vector will have an angle relative to the provided normal higher than the incident angle.
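This can be verified numerically. The following Python sketch applies the reflect formula above with a unit-length and a half-length normal (the vectors are chosen for the example):

```python
import math

def reflect(I, N):
    # R = I - 2 * dot(I, N) * N  (same formula as the built-in reflect)
    d = sum(i * n for i, n in zip(I, N))
    return tuple(i - 2.0 * d * n for i, n in zip(I, N))

def angle_deg(a, b):
    # Angle between two vectors, in degrees.
    dot = sum(x * y for x, y in zip(a, b))
    norm = lambda v: math.sqrt(sum(x * x for x in v))
    return math.degrees(math.acos(dot / (norm(a) * norm(b))))

s = 1.0 / math.sqrt(2.0)
I = (s, -s, 0.0)               # incident ray at 45 degrees to the surface
n_unit = (0.0, 1.0, 0.0)       # normalized normal
n_half = (0.0, 0.5, 0.0)       # non-normalized normal, length 0.5

a_unit = angle_deg(reflect(I, n_unit), n_unit)  # the mirror angle, ~45 degrees
a_half = angle_deg(reflect(I, n_half), n_unit)  # noticeably larger: distorted
print(a_unit, a_half)
```

The reflection computed with the short normal leans much further away from the surface normal than the true mirror reflection, which is exactly the distortion the effect exploits.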

When the greyscale map has values around 0.5, the normal switches direction, so totally different parts of the cube map are read depending on whether the greyscale value is lower or higher than 0.5. This creates uneven spots that reflect the white parts of the cube map right next to areas reflecting the rocky parts. Since the region of the greyscale map that switches between positive and negative normals is also the region that produces the most distorted reflection vectors, the white reflected spots become very uneven and distorted, giving the desired “swirly” effect.

If the shader outputs only the static reflection using non-normalized and clamped grey normals we get an effect like the one shown below in figure 8:


Ice Cave Swirl Effect.png

  1. Fig. 8: The swirl effect that produces the icy look everybody likes

The clamping seems to be relevant to the icy effect, as it produces normals oriented mainly in two opposite directions, which is the main factor defining the swirl-like pattern. However, if we remove the clamping of the greyscale normals, the normals point in one main direction and we get the different visual effect shown in figure 9:


Ice Cave Slanted band pattern.png

  1. Fig. 9: The removal of the clamp results in a slanted band pattern which is much more evident when the camera moves

The use of two normal maps is not the only thing that influences the reflections on the icy walls. Reflections in the Ice Cave are obtained through local cubemaps, a very effective and low-cost way of implementing reflections in a scene. These cubemaps have a local correction applied to them, which ensures that reflections behave realistically and change as expected as one looks around the cave. Local correction is needed because the cave is a bounded environment, so the reflections inside it should behave differently from those of an infinite skybox. The local correction makes the effect appear realistic; without it the reflections remain static, giving the illusion that the geometry is sliding over the image rather than the reflections sitting within the geometry, and there is no feeling of depth or of the ice being thick.
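The local correction itself can be sketched as follows (Python, illustrative only; `bbox_min`, `bbox_max` and `cubemap_pos` are names chosen here for the cave's bounding volume and the cubemap capture position):

```python
def local_correct(pos, R, bbox_min, bbox_max, cubemap_pos):
    # Find where the reflection ray leaving `pos` along `R` exits the
    # axis-aligned bounding box approximating the environment.
    t = float("inf")
    for axis in range(3):
        if R[axis] > 0.0:
            t = min(t, (bbox_max[axis] - pos[axis]) / R[axis])
        elif R[axis] < 0.0:
            t = min(t, (bbox_min[axis] - pos[axis]) / R[axis])
    hit = tuple(p + t * r for p, r in zip(pos, R))
    # Sample the cubemap with the direction from its capture position
    # to the hit point, instead of with the raw reflection vector.
    return tuple(h - c for h, c in zip(hit, cubemap_pos))

# A fragment at the origin of a 2x2x2 box, reflecting along +X, seen
# from a cubemap captured slightly off-centre:
print(local_correct((0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                    (-1.0, -1.0, -1.0), (1.0, 1.0, 1.0), (0.5, 0.0, 0.0)))
```

Because the lookup direction depends on the fragment position, the reflection shifts correctly as the camera moves through the cave, which is what gives the walls their sense of depth.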

More information as to how local cubemaps work can be found in this blog:

It was an interesting journey to understand the workings behind this effect, as it was achieved through relatively unconventional means. The in-depth study helped us understand how the two normal maps behave together and what exactly causes the result displayed in the demo. The icy wall reflections are effective, inexpensive and perform as desired: they make the cave seem as if it is actually made of thick ice.

When creating an animation, it is paramount to have a very clear objective and vision of the asset and its function. How much is the object going to deform? How visible is it going to be in the scene? How complex is the overall animation? These are some questions to ask as you are making the asset. Outlining and understanding the important aspects of the task can be the difference between smooth progress and a difficult, rocky workflow. In this blog I hope to address issues that people might encounter when animating and give advice on how to deal with them. The examples use Autodesk Maya and Unity, but the theory behind the workflow and habits applies to any 3D package and game engine out there.

Asset Production:

It is important to understand the role of the asset within the scene, as this will determine many aspects of its geometry and design. You can get away with extra polygons if your demo is going to be shown on a PC or laptop, but if you are targeting a mobile platform then every single vertex tends to be accounted for. Knowing how much of the budget is available for assets is always useful, as it lets you ensure all available vertices are used wisely and effectively.

The time spent making sure an asset is finished, optimised and ready to be incorporated into the game is time well spent, as it means little to no modification will be needed in the later stages of the production process. There is nothing worse than finding out the model’s poly count is too high and needs to be reduced after having spent time weighting and animating it. In a case like this you could reuse the animations, but the model will need new weights, as the vertex count will be different after a reduction. Even then, a reduction in vertices might change how the bones influence the mesh, so if the deforming mesh behaves strangely the animations may have to be discarded too.

It is a good habit to spend a reasonable amount of time on a task rather than rushing through it. Rushing one part of the process because the task seems to drag on and you’re itching to start something else is a very bad idea, as it tends to come back to bite you later. Below is an example of a good workflow for making optimised assets. The diagram defines each stage and allows clear progression from one part of the process to the next.

It is worth emphasising that whilst it is tempting to keep working on an asset to achieve the perfect balance between optimised mesh and high quality, there is a point where you should just declare it finished. You could consider an asset complete when it has a low poly count, the mesh is optimised for its purpose within the scene, has a good texture map and runs on the device whilst looking its best.


Figure 1 diagram.png

Fig. 1- example of workflow during asset production


Removing Transformations on a model:

Another point to emphasise is the cleanliness of the model. A clean model is one that has no transformations applied to it and sits at the origin of the scene. Any transformation or residue (anything that will influence the model, such as a leftover animation keyframe) remaining on the model will affect the animation, so it is essential for the asset to be clean and free from anything that could potentially influence the bones.

Before starting anything, freeze all transformations, delete the history of the scene, and make sure the model is where it should be and faces the correct direction. This establishes a neutral point to which you can always return during the animation process. The controllers used to move objects around a scene store the transformations as X, Y and Z values. If one wants to return to the initial position at any point in the animation, it makes sense for that point to be 0, 0, 0 instead of arbitrary values that differ from controller to controller and would be difficult to track.


It is also worth pointing out that if one forgets to freeze the transformations of a controller before binding it to a bone, the transformations of that controller will influence the bone and will almost certainly make it move in ways that are not desired.


Overall, zeroing out the transformations on the asset and on anything that is going to be applied to the asset is a good habit to keep, and one that most definitely pays off throughout the process.



Fig. 2- Mesh with transformations versus mesh without any transformations.
All the transformations applied to a mesh can be seen in the Channel Box menu.

Understanding Animation:

This is also a good point to introduce some terminology that might be used interchangeably throughout the text, in order to prevent any confusion:

  • When talking about the asset or model that is to be animated, one might refer to it as the ‘mesh’ or ‘skin’ as well as the terms used so far. 
  • ‘Rig’ and ‘skeleton’ are sister-terms, both refer to the hierarchy of bones and joints set up inside or around the object in order to animate it.
  • The bones are ‘bound’ to the skin, and will influence the mesh and deform it when moved. Skin weights or the action of ‘paint weighting’ allows control over that influence and enables the user to fix any incorrect deformations or influence that might occur.
  • Controllers are curves, or maybe other polygons, parented to the object or joint in order to make the animation process easier.

Moving the Mesh:

I hope these terms are clear and make it easier to understand some of the elements mentioned so far. Turning back to the clean mesh, at this point one should start considering how to proceed with the animation. Looking at the mesh construction tends to be a good starting point, as this might play a deciding factor. Not all meshes need a skeleton in order to be animated; skeletons and skinning can get expensive, so if the asset can be animated through a parented hierarchy it is always better to do so. A character with detached limbs (think Rayman) or the pieces of an engine moving in unison are good examples of assets that animate just fine with a parent hierarchy.


Here is an image of a very simple parent hierarchy set up in Maya:

Figure 3a.png

Figure 3a- Parent hierarchy example scene

Figure 3b.png

Fig. 3b- Parent hierarchy example set up


In the example shown in Figure 3a there are some simple shapes orbiting a cube. Each coloured controller moves a shape individually, the black controller controls the small shapes together, and the white controller moves both the small and big shapes. It is a simple set up, but with it one can move the shapes, set the orbit, and even move the whole arrangement with ease.

The Rig:

On the other hand, humanoids, organic models or more complex assets do benefit from having a skeleton rig drive them. These rigs work in a similar enough way to how physical skeletons move a body. The bones are set up with IK handles, which create an effect close enough to muscles pulling on a joint to make it move. Rigs are easy to build and become familiar with, but can get complex very quickly, as shown in the example below:

fig 4.png

Fig. 4- Top-down view of the rig on a lizard


This rig contains about 95 bones and their respective controls, constraints (specific influences controllers cast on the joints) and weights. It works very smoothly, deforms nicely, allows good control over the lizard mesh, and performs well on a mobile platform. This rig was designed with complex movement in mind: it goes as far as having controls that allow the digits to contract and relax (Fig. 5).

fig 5.png

Fig. 5- Close up of finger control


Optimising a Rig:

This level of detail is fine if the camera is going to come quite close to the lizard and capture the finer movements, but it might not be the ideal set up for something aimed at a mobile device, or for a scene where the camera never gets close enough to appreciate those movements. In this particular case the asset happened to be the only animated one in the scene, so there was enough budget to accommodate the number of bones and influences, but what if that were not the case? Bones would need to be removed to make room for more animated characters. In this example, removing the extra bones in the hands and feet and reducing the bones in the tail would remove around 65 bones, which is more than enough to animate another character and would reduce the bone count on the model by two thirds.

fig 6.png

Fig. 6- simple rig on a phoenix


Whilst the lizard is not an ideal candidate for a rig to drive an animation aimed for a mobile device, the rig on the phoenix is a much better example. In this case, the rig originally featured 15 bones, but an extra three were added to spread the influence caused in the lower part of the mesh, bringing the total count up to 18 bones. This particular model is also featured in a scene with other animated elements and plenty of effects, and was not meant to perform any particularly complex animation, so 18 bones is what it needs.


Always think carefully and take care when building the rig and controls that will drive your model. Make sure you understand what the animation is meant to achieve, and aim to build the rig in such a way that it can bring the object to life with the fewest bones possible. As shown in fig. 7, a lot can be achieved with this particular rig.


Fig. 7- Progression of the animations of the phoenix, from complex flight to simpler, looping animation


The Animation Process:

So far we have touched on the production of the assets, the rigging and skinning process and some optimisation practices, so it is time to address the actual animation process. In computer animation, animating tends to be carried out by positioning the model and keyframing the pose. The series of keyframes are then played one after another and blended together to form the resulting animation.
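As a hypothetical illustration of that blending (the keys and values below are invented for the example), an engine samples between keyed poses roughly like this:

```python
def sample(keys, t):
    # keys: sorted (time, value) pairs; linearly interpolate between the
    # two keyframes surrounding time t.
    for (t0, v0), (t1, v1) in zip(keys, keys[1:]):
        if t0 <= t <= t1:
            f = (t - t0) / (t1 - t0)
            return v0 + (v1 - v0) * f
    return keys[-1][1]  # hold the last pose past the final key

# A simple looping swing on one rotation channel, keyed at three poses:
keys = [(0.0, 0.0), (1.0, 90.0), (2.0, 0.0)]
print(sample(keys, 0.5))   # halfway between the first two keys: 45.0
print(sample(keys, 1.5))   # halfway back down again: 45.0
```

Real engines interpolate full transforms with curves rather than scalars linearly, but the principle of blending between keyed poses is the same.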


When animation movements are translated to the game engine, they can either be matched to triggers and played in response to them, or left to loop around on their own for as long as it’s needed. Simple looping animations are a very easy way to bring a scene to life without a complex animation, and if done right they can give the illusion of being one long string of constant movement.


ARM’s Ice Cave demo makes use of these types of animation to bring the cave to life. The butterfly and the tiger both move around the cave in constant loops, timed to fit with each other, and the phoenix constantly tries to come back to life but is always stopped by the loop of its animation taking it back to its sleeping, starting state.


Fig. 8- The Ice Cave Demo by ARM


Throughout the production of Ice Cave we found that this was the best way to bring dynamism to the scene, as it allows the video to loop continuously without restarting the demo when an animation stops.

I have repeated throughout this article that it is important to have a clear vision of what one is trying to achieve with a project, because this knowledge makes many aspects of the production much smoother. More often than not the result is good, optimised models, a well-constructed scene and cleverly tailored visual effects that, put together, create the illusion that the overall product is of much higher specification than it actually is.


A breakdown of the scene, its elements and their purpose will always help. Also consider how viewers will interact with the finished product: is everything going to reset after a certain point, or is it going to play continuously? Using these as a basic guideline, it will soon become clear what the best way to animate the objects is and how best to go about it.

Animations in a Game Engine:

I hope that by this point it is evident that asset creation and animation is quite a complex process, full of elements to remember and consider at every point in the pipeline. The last step is to export the animation and place it within the scene in your game engine of choice.


There are a few formats you can export your animated model to, but the most widely used are .fbx and .dae. Unity can also handle Maya’s .ma and .mb files, which can contain animations. The theory is simple enough, but in practice a few things can go wrong, resulting in the animation not exporting, or exporting incorrectly.


3D model viewers are extremely useful for previewing animations, as what looks fine in Maya might not match what you get in Unity or other game engines. Assimp, Open3mod and the Autodesk FBX Converter are some examples of 3D viewers; the FBX Converter is particularly useful as it can convert files from one format to another (fig. 9). This became very useful in situations where animations would only export correctly in one file format but not the one that was needed. Even after running the model through some 3D viewers, it is always worth checking one last time within Unity or the game engine of choice. Unity lets the user view animations within the Inspector tab (fig. 10), which gives an idea of how the animated model will look in the scene. It is worth noting that sometimes the mesh will deform awkwardly; before panicking and assuming the animation exported incorrectly, check how many bones are influencing each vertex, as this might be the root of the problem.

fig 9.png

Fig. 9- Screenshot for the Autodesk FBX converter

fig 10.png

Fig. 10- Unity inspector tab displaying the animation of the phoenix


Taking an asset from start to finish is a very long process, full of situations where things can go wrong very easily, but understanding how each part of the process works and how best to approach it makes it much easier to handle. Throughout this article I have talked about 3D assets and how to progress from the initial sculpt to a fully animated asset integrated within the scene, with a focus on the animation part of the process. I hope this has provided insight into the more artistic side of the 3D art world, resolved any doubts, and offered useful advice for avoiding problems and keeping up good habits throughout the process.

Starship was formed to use our extensive experience developing software for games & simulations and apply it to market segments that hadn’t yet been exposed to the transformative power of digital technology.



One of the markets we quickly identified was cooking: people are obsessed with celebrity chefs, cooking shows and recipe books, but they haven’t really taken advantage of the latest software features when transferring across to the app space - most recipe apps are, at best, a glorified PDF, and cooking games are rendered in a cartoon style. We were sure we could do a lot better than that!


Our primary technical worry, though, was the steep “uncanny valley” drop-off. Just as with human faces, the brain has evolved to spot fake or bad-looking food a mile off. If we wanted realism, it wouldn’t be computationally cheap. On the plus side, our initial UX experiments immediately found the fun on tablet devices, where interaction can be tactile and the screen size closely matches the pans and plates we wanted to represent.


CyberCook's objective then was to achieve a realistic 3D simulation of how food looks and behaves, all while running on tablet (and mobile) hardware at 30fps.





In general, food has pretty similar requirements to human skin to look realistic, which meant we could use the plentiful skin shading research as a reference. As we found, translucency, subsurface scattering, energy conserving BRDFs, IBL reflections, linear lighting and depth of field are all required to render believable food, while being quite a tall order for the mobile GPUs at the time of development.


A physically based solution would have been the ideal choice, but we couldn't afford it on mobile, so we opted instead for a physically inspired solution, carefully testing which features made the most difference to the end result and letting go of the energy conservation requirement outside of the main BRDF.


The base intuition we took from our preliminary research on the task was that Depth of Field and Linear lighting are essential to the perception of skin and organic materials as realistic. The typical gamma falloff is ingrained in our mind, after a couple of decades of 3D games, and it screams "I'm fake".




Starship graphics programmer Claudia Doppioslash (doppioslash) had the tricky job of picking the right techniques that would enable the artists to create the assets they needed:


"Linear lighting is not available for mobile in Unity, so we had to implement it from scratch. While it's a simple matter of making sure all your texture and colour inputs are wrapped in a pow(<colour>, 2.2) and having a full screen effect that sets it back to gamma at the end, it's also fiddly, takes up computing power and was confusing for the artists. At that time full screen effects were not supported in Unity's scene view, so they had to edit the scene in gamma while taking care to preview their work in a game camera with the full screen effect on.


Depth of Field, while being an expensive full screen effect we paid for in performance, really helped the perception of the image as having been taken from a real camera belonging to a tv cooking show or a professional food photographer. Our artists researched extensively the look of food photography to apply it to CyberCook."
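The effect of the gamma/linear round trip described above can be sketched in a few lines (Python, illustrative only; the 2.2 exponent is the approximation mentioned in the quote):

```python
def to_linear(c):
    # Gamma -> linear, the pow(colour, 2.2) applied to inputs.
    return c ** 2.2

def to_gamma(c):
    # Linear -> gamma, applied by the final full-screen pass.
    return c ** (1.0 / 2.2)

# Averaging two colours in gamma space vs. (correctly) in linear space:
a, b = 0.2, 0.8
gamma_blend = (a + b) / 2.0
linear_blend = to_gamma((to_linear(a) + to_linear(b)) / 2.0)
print(gamma_blend)   # 0.5
print(linear_blend)  # noticeably brighter, closer to how light really adds
```

It is exactly this difference in how mid-tones combine that makes gamma-space lighting read as "fake" on organic materials.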




"The choice of BRDFs was at the heart of achieving realism. We know the Blinn-Phong look all too well and have associated it with video games. The moment you see it your brain is subconsciously reminded that you are looking at a game. It's especially bad for organic matter and it wasn't much good as a simulation of the food being coated in oil, either.


We relegated it to be used for non-organic, non-metallic materials in the kitchen environment. The main specular BRDF used for food, metal, and wood is an energy conserving Cook-Torrance with a GGX distribution. It can give the soft quality necessary to believe that something is organic and also the smooth one necessary for the metal objects, and is, all in all, a very flexible BRDF."
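For reference, the GGX normal distribution at the heart of such a Cook-Torrance term can be sketched like this (Python, illustrative only; the roughness-to-alpha remapping shown is one common convention, not necessarily the one CyberCook used):

```python
import math

def ggx_distribution(n_dot_h, roughness):
    # GGX / Trowbridge-Reitz NDF: D = a^2 / (pi * ((n.h)^2 * (a^2 - 1) + 1)^2)
    a2 = roughness ** 4  # alpha = roughness^2, a widely used remapping
    denom = n_dot_h * n_dot_h * (a2 - 1.0) + 1.0
    return a2 / (math.pi * denom * denom)

# Low roughness gives a tight, intense highlight (smooth metal);
# high roughness gives the broad, soft lobe that reads as organic:
print(ggx_distribution(1.0, 0.1))
print(ggx_distribution(1.0, 0.9))
```

The long, soft tail of the GGX lobe compared with Blinn-Phong is what gives both the organic softness and the smooth metallic highlights from a single, flexible BRDF.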




"We also used the anisotropic Ashikhmin-Shirley BRDF for the oil coating specular on the food and for the most important metal objects, such as the pan and the hob. The food oil coating was a ripe ground for experiments, as it's not a problem many games have. Ashikhmin-Shirley is expensive but the results are miles away from the alternatives.


Having different BRDFs made it hard to achieve IBL coherent with the lighting of the scene. We used the technique in the Black Ops 2014 Siggraph presentation [1], but it was meant to be used with Blinn-Phong as a distribution. Nevertheless it worked well enough for our requirements.


Last but not least, we used a number of diffuse BRDFs: Phong, of course, then Oren-Nayar was used for some vegetables which didn't look good enough with Phong. Our implementation of Subsurface Scattering follows Penner's pre-integrated skin shading 2011 Siggraph talk [2].


We were forced by the complexity of how prawns look in real life to implement a very approximated translucency effect inspired by the DICE 2011 GDC talk [3]."



[1] Getting More Physical in Call of Duty: Black Ops II

[2] Eric Penner's Pre-Integrated Skin Shading

[3] Approximating Translucency for a Fast, Cheap and Convincing Subsurface Scattering Look

Following the strategic partnership announcement at GDC 2015, held in San Francisco from 2nd to 6th March (“ARM and Tencent Games collaborate to advance mobile gaming”), we have worked with Tencent Games to provide Tencent’s mobile game developers with:

  • Access to the latest developer boards based on high-performance ARM Cortex® CPUs and ARM Mali™ GPUs
  • Guidelines for developing mobile games on ARM-based solutions that address compatibility and performance challenges
  • Access to engineering resources from Tencent R&D Game studios and the ARM ecosystem

To better manage all the collaborations with Tencent Games, over the last five months we have worked with Tencent Games to build a joint innovation lab, with sites at Tencent Games’ headquarters and ARM’s Shanghai office. The lab is now open to Tencent Games’ roughly six thousand game developers.

205300694411383192-resize.jpg

“ARM + Tencent Games Innovation Lab” at Tencent’s Office



“ARM + Tencent Games Innovation Lab” at ARM Shanghai Office


At the lab we provide lots of development devices based on high-performance ARM Cortex® CPUs and ARM Mali™ GPUs, such as:

  • Xiaomi TV II 55’ powered by MStar DTV Chipset (Quad-core A17 CPU and Mali-T760MP4 GPU)
  • Xiaomi 4K TV Box powered by Amlogic chipset (Quad-core A9 CPU and Mali-450MP4 GPU)
  • Nagrace HPH Android TV Gaming Box powered by Rockchip RK3288 chipset (Quad-core A17 CPU and ARM Mali-T760MP4 GPU)
  • HTC Desire 626w Android phone powered by MediaTek MT6752 chipset (Octa-core A53 CPU and Mali-T760 GPU)
  • Samsung Galaxy S6 powered by Samsung Exynos 7 Octa – 7420 (Octa-core big.LITTLE  A57 and A53 CPU and Mali-T760MP8 GPU)
  • Samsung Gear VR and Galaxy Note 4 powered by Samsung Exynos 7 Octa – 7410 (Octa-core big.LITTLE A57 and A53 CPU and Mali-T760MP6 GPU)
  • 64bit Android Smart TV powered by HiSilicon DTV Chipset (Quad-core A53 CPU and Mali-450MP6 GPU)
  • And other devices powered by ARM CPU and Mali GPU



In addition to providing these devices, we have pre-installed demos created by the Mali ecosystem engineering team and ecosystem partners, including:

  • Ice Cave – Built on Unity 5, using global illumination powered by the Enlighten engine along with many advanced features such as soft shadows


  • Moon Temple – Built on Unreal Engine 4 with ARM 64-bit enabled and Mali-specific features such as ASTC (Adaptive Scalable Texture Compression) and PLS (Pixel Local Storage)

UE demo-resize.jpg

  • Cyber Cook – A fun mobile VR game from Starship for Samsung Gear VR


  • 格斗江湖 TV Game from Tencent Games


  • And other demos showcasing how to leverage ARM technologies to optimize games


By studying these demos, game developers can more easily understand how leveraging ARM technologies benefits their games, and can try the techniques in their own titles. Ultimately, we expect Tencent’s game developers to create great games with better compatibility, higher performance, greater visual effects and better power efficiency.


Under this joint lab we also work with Tencent Games to organize regular workshops providing face-to-face communication between Tencent game developers and ARM ecosystem engineers. For example, we recently worked with Unity and Tencent Games to organize VR workshops at Tencent’s Shanghai and Shenzhen offices, which were very successful, as the pictures below show.


510091093713461396-resize.jpg

ARM VR DAY at Tencent Office

We are pleased to announce the release of Mali Graphics Debugger 3.0, which focuses on the user experience and makes the most out of all the work that has been done in the last two years. This release has required a great engineering effort, which started a year ago, during which time we have also added OpenGL ES 3.1 and Android Extension Pack support, ARMv8 64-bit targets, live shader editing, and support for all the new released versions of the Android system.

Version 3.0 adds support for multi-context applications and the capability of tracing multiple processes on an Android and Linux system. We have also changed our underlying GUI framework and added a few new UI features to most views.


Read the release announcement on


Mali Graphics Debugger 3.0 small.png

This coming Sunday I am excited to be chairing "Moving Mobile Graphics" at SIGGRAPH in sunny downtown Los Angeles. The half-day course will provide a technical introduction to mobile graphics, with the twist that the content spans the hardware-software spectrum and discusses the state of the art with practitioners at the forefront. I hope the range of perspectives will give attendees a good feel for how the whole mobile stack hangs together, and also illustrate the trends and future directions being explored.


SIGGRAPH page: Moving Mobile Graphics | SIGGRAPH 2015

Course home page:


To cover the spectrum, the speaker list is a cross-section of the industry: Andy Gruber, graphics architect of the Qualcomm Adreno GPU, will discuss some of the things mobile GPUs do in order to operate in such low-power conditions, including the Adreno geometry flow. Andrew Garrard of Samsung R&D UK will discuss how mobile GPU architectures affect the software and software APIs we build on. Marius Bjorge will present recent API and algorithmic innovations that can dramatically reduce bandwidth on mobile, including on-chip rendering techniques such as Pixel Local Storage and ways to construct highly efficient blur chains for post-processing effects such as bloom.


To represent the state of the art in games development we have three speakers from different areas of the industry: Simon Benge from Exient, developers of Angry Birds Go! and Angry Birds Transformers, will discuss how they squeezed the limits of mobile while keeping a broad free-to-play install base; Niklas Nummelin of EA Frostbite will discuss how the AAA graphics engine Frostbite is being brought to mobile; and finally Renaldas Zioma from Unity will discuss how the physically based shading in Unity 5 was optimised for mobile.


More information on the course can be found on the event site above. As of Sunday this will include course slides and notes, so if you are unable to attend in person be sure to check back after Sunday!



