Another beautiful day in San Francisco and it’s back to Moscone West for more awesome expertise from across the game dev industry. Yesterday afternoon we joined Epic and Oculus to hear how the collaboration between Unreal Engine, Oculus and ARM can support developers in creating super high quality VR content. As they said right off the bat, mobile is a much more challenging platform than console or PC, so it’s really important developers are given every tool possible to make life easier. They discussed the simple steps you can take yourself, like avoiding dynamic content wherever you can in favour of prebaking, to save power and processing bandwidth. Another common-sense step towards a smoother dev experience is to do your testing on the actual devices you intend to target. Not only that, but testing for the length of time you expect from end-user gameplay sessions will give you a much better idea of how your game will behave in the real world.
Other comparatively easy optimizations include not rendering anything that can’t be seen: removing triangles that won’t be in shot, culling the back faces of objects that will never be in view, and avoiding wasted draw calls on anything that’s occluded by something else. 4x MSAA is also enabled as standard on Mali for the Gear VR and it’s basically free, so you may as well use it. 8x is available too, and even that has a barely perceptible cost yet can add hugely to your application’s image quality.
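If you’re curious what requesting that looks like outside of an engine, here’s a minimal sketch of choosing a 4x multisampled EGL config. It’s illustrative only: chooseMsaaConfig is a made-up helper name, and on Gear VR the Oculus Mobile SDK wraps this setup for you, but the request to the driver is essentially the same.

```cpp
#include <EGL/egl.h>

// Minimal sketch: asking EGL for a 4x multisampled config.
// chooseMsaaConfig is a hypothetical helper, not SDK code.
EGLConfig chooseMsaaConfig(EGLDisplay display)
{
    const EGLint attribs[] = {
        EGL_RENDERABLE_TYPE, EGL_OPENGL_ES3_BIT, // needs EGL 1.5 (or the _KHR variant)
        EGL_RED_SIZE,        8,
        EGL_GREEN_SIZE,      8,
        EGL_BLUE_SIZE,       8,
        EGL_DEPTH_SIZE,      24,
        EGL_SAMPLE_BUFFERS,  1, // enable multisampling
        EGL_SAMPLES,         4, // 4x MSAA: resolved in tile memory on Mali
        EGL_NONE
    };

    EGLConfig config  = nullptr;
    EGLint numConfigs = 0;
    eglChooseConfig(display, attribs, &config, 1, &numConfigs);
    return numConfigs > 0 ? config : nullptr;
}
```

The reason this is so cheap on Mali is that the multisample resolve happens in on-chip tile memory, so the extra samples never travel across the main memory bus.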
The guys from Unreal explained how monoscopic rendering reduces the overhead caused by VR’s need to render a slightly different view for each eye. Epic implemented a system whereby a stereoscopic camera takes care of the two eye views in the near distance, while a third camera handles anything more than around 30 feet from the user. This works because the human eye can’t detect depth accurately past a certain distance, so far-away objects don’t need the subtle difference in viewpoint that closer objects do. To avoid rendering occluded or unnecessary objects, you render the stereoscopic cameras first and use the result to establish which areas of the far distance you can skip. This process can get you up to 20% performance gains in certain environments, but because it’s so dependent on the surroundings there’s also potential for a decrease, so you have to be careful, which is why Unreal have made it possible to switch the feature on and off between scenes.
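Here’s a rough sketch of that ordering. Camera, renderNearField, renderFarField and compositeFarBehindNear are hypothetical stand-ins to show the structure; none of this is Unreal’s actual implementation.

```cpp
// Hypothetical camera type and render helpers, purely to illustrate the
// pass ordering. This is not Unreal's actual API.
struct Camera { float position[3]; float yaw, pitch; };

void renderNearField(const Camera& eye)  { /* geometry closer than ~30 ft, per eye */ }
void renderFarField(const Camera& mono)  { /* distant geometry, rendered once */ }
void compositeFarBehindNear()            { /* fill only pixels the near passes left empty */ }

void renderFrame(const Camera& leftEye, const Camera& rightEye, const Camera& monoCam)
{
    // 1. Stereo near-field passes first; these also establish which
    //    pixels are already covered by nearby geometry.
    renderNearField(leftEye);
    renderNearField(rightEye);

    // 2. One monoscopic far-field pass: beyond roughly 30 feet the eye
    //    can't detect the disparity, so a single view serves both eyes.
    renderFarField(monoCam);

    // 3. Composite the far image only where the near passes left gaps,
    //    skipping far-field pixels that are occluded anyway.
    compositeFarBehindNear();
}
```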
Next up, our in-house graphics gurus got together with Unai Landa, CTO of Digital Legends, to share their golden rules for achieving console quality games (like Digital Legends’ Aftershock) on mobile. As we’ve heard a lot this week, thermal limits are our nemesis, so it’s vital to keep them in mind throughout development. We were shown how innovations in ARM Mali’s new Bifrost architecture, as well as big.LITTLE technology for task allocation, can give you a head start on efficiency right from the word go. Unai talked about Digital Legends’ motivations in optimizing Aftershock.
Lighthearted as his take was, the sentiment is serious, and he laid out five key principles to follow to maximise your chances of a console quality, highly performant game that won’t drain your battery in five seconds flat. Rather than list them all here, I’ll point you to the comprehensive guide the helpful folks at Develop Online have published.
Once the basics had been established, it was super helpful to see how Digital Legends applied them in practice. They showed practical examples of occlusion culling, overdraw removal, data stream optimization and heaps more, all of which was much easier to visualise in a real-world setting.
Of course, optimizing your game isn’t always easy, but once you’ve worked through the key principles there’s a range of tools that can help a lot, and we were taken through a workflow of which to try first to maximise your chances of ironing out any kinks. The first step is to analyse your application with DS-5 Streamline, which profiles both the CPU and GPU to establish exactly where your issues lie. If it turns out they’re on the CPU side then you’re all set: Streamline can take you right through the process. If you find you’re GPU bound, you can switch to the Mali Graphics Debugger (MGD) to see exactly where your application is struggling, and then move to the Mali Offline Compiler to optimize your shaders.
My final session this morning took a much closer look at Vulkan to demonstrate just how many benefits it provides to the developer over OpenGL ES. It’s great to see that in just a year since the API’s release, not only have numerous industry players released drivers and SDKs, but Vulkan is now also integrated into the Unity engine. Two of the most important features for developers are its improved multithreading capabilities and Multipass, which is similar to ARM’s Pixel Local Storage and ideal for tiled GPU architectures like Mali, because each pixel rendered in a subpass can access the result of the previous one. This means less data transferred to main memory and therefore less bandwidth used.
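To make the Multipass idea concrete, here’s a hedged sketch of a two-subpass Vulkan render pass in which the second subpass reads the first’s colour output as an input attachment. createMultipassExample and the attachment layout are my own illustration, not code from the talk.

```cpp
#include <vulkan/vulkan.h>

// Sketch: a two-subpass render pass where subpass 1 reads subpass 0's
// colour output as an input attachment. On a tiler like Mali the
// intermediate result can stay entirely in tile memory.
VkRenderPass createMultipassExample(VkDevice device, VkFormat colorFormat)
{
    // Attachment 0: intermediate colour, written by subpass 0, read by subpass 1.
    // In a real app you'd create its image with VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT
    // and lazily allocated memory, since it never needs backing in main memory.
    VkAttachmentDescription attachments[2] = {};
    attachments[0].format         = colorFormat;
    attachments[0].samples        = VK_SAMPLE_COUNT_1_BIT;
    attachments[0].loadOp         = VK_ATTACHMENT_LOAD_OP_CLEAR;
    attachments[0].storeOp        = VK_ATTACHMENT_STORE_OP_DONT_CARE; // never written out
    attachments[0].stencilLoadOp  = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
    attachments[0].stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
    attachments[0].initialLayout  = VK_IMAGE_LAYOUT_UNDEFINED;
    attachments[0].finalLayout    = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;

    // Attachment 1: the final colour target, e.g. a swapchain image.
    attachments[1]             = attachments[0];
    attachments[1].storeOp     = VK_ATTACHMENT_STORE_OP_STORE;
    attachments[1].finalLayout = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR;

    VkAttachmentReference color0 = { 0, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL };
    VkAttachmentReference input0 = { 0, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL };
    VkAttachmentReference color1 = { 1, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL };

    VkSubpassDescription subpasses[2] = {};
    subpasses[0].pipelineBindPoint    = VK_PIPELINE_BIND_POINT_GRAPHICS;
    subpasses[0].colorAttachmentCount = 1;
    subpasses[0].pColorAttachments    = &color0;

    subpasses[1].pipelineBindPoint    = VK_PIPELINE_BIND_POINT_GRAPHICS;
    subpasses[1].inputAttachmentCount = 1;
    subpasses[1].pInputAttachments    = &input0; // per-pixel read of subpass 0's result
    subpasses[1].colorAttachmentCount = 1;
    subpasses[1].pColorAttachments    = &color1;

    // A BY_REGION dependency declares the read as pixel-local, which is
    // what allows a tile-based GPU to merge both subpasses on-chip.
    VkSubpassDependency dependency = {};
    dependency.srcSubpass      = 0;
    dependency.dstSubpass      = 1;
    dependency.srcStageMask    = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
    dependency.dstStageMask    = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT;
    dependency.srcAccessMask   = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
    dependency.dstAccessMask   = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT;
    dependency.dependencyFlags = VK_DEPENDENCY_BY_REGION_BIT;

    VkRenderPassCreateInfo info = {};
    info.sType           = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO;
    info.attachmentCount = 2;
    info.pAttachments    = attachments;
    info.subpassCount    = 2;
    info.pSubpasses      = subpasses;
    info.dependencyCount = 1;
    info.pDependencies   = &dependency;

    VkRenderPass renderPass = VK_NULL_HANDLE;
    vkCreateRenderPass(device, &info, nullptr, &renderPass);
    return renderPass;
}
```

Because the intermediate attachment is never stored to main memory and the dependency is per-region, the whole chain can resolve on-chip, which is exactly the bandwidth saving described above.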
Not only is Vulkan integrated with Unity but now the Mali Graphics Debugger is too, giving you almost everything you need to create a great game, all in one engine.
The cofounder of Infinite Dreams took us through the optimizations they used in Sky Force Reloaded, an awesome shoot ‘em up with complex graphics, intense action and therefore a whole lot of pressure on the CPU and GPU. They had realised they were struggling with efficiency: most of the negative feedback the game received related to how quickly it drained the user’s battery. They initially suspected fill rate but established this wasn’t the issue, so it quickly became apparent that their up to 1,000 draw calls per frame could be the cause of their problems, as even top-end devices struggle with that many.
This meant OpenGL ES wasn’t a great option for them, so the fact that Vulkan in Unity works right out of the box made their lives a whole lot easier. They got to work testing frames and quickly realised the gains were massive: Sky Force Reloaded ran up to 82% faster on Vulkan, and 32% faster on average! There are of course heaps of other areas that can be optimized, so the fact that such big gains came from simply switching to Vulkan gives you some idea of just how valuable it is. The guys at Infinite Dreams were able to use this headroom to add features and complexity to make their game even more impressive, while retaining the same FPS and gaining 10-12% more playtime!
After this we heard from Unity themselves who talked us through the process of implementing Vulkan and provided their own Pro Tips for getting the most from it:
Pro tip 1: GPUs really, really hate switching between buffer bindings, whereas having a large buffer and simply changing offsets is almost free (see the sketch after this list).
Pro tip 2: Descriptor set objects may get consumed at bind time.
Pro tip 3: Mobile GPUs don’t do any magic on constant buffers! They’re just pointers to main RAM.
Pro tip 4: GPUs have caches for memory access, typically 64- or 32-byte cache lines. A larger constant buffer for “rarely accessed” parameters is a bad idea as it thrashes the cache.
Pro tip 5: Don’t bother with reusing secondary command buffers. On most GPUs reusing a secondary command buffer requires so much patching to the buffer that there’s very little benefit over rebuilding the whole command buffer anyway.
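To illustrate Pro tip 1 (and, in passing, tips 3 and 4), here’s a small sketch using a dynamic uniform buffer: one big buffer is exposed once through a single descriptor set, and each draw just supplies a new offset. drawBatch and its parameters are my own illustration, not Unity’s code.

```cpp
#include <vulkan/vulkan.h>
#include <cstdint>

// Sketch of Pro tip 1: all per-draw constants live in one large buffer,
// bound once through a descriptor set containing a
// VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC binding. Per draw we change
// only the dynamic offset; the buffer binding itself never switches.
void drawBatch(VkCommandBuffer cmd,
               VkPipelineLayout layout,
               VkDescriptorSet perDrawSet, // one dynamic UBO binding inside
               uint32_t drawCount,
               uint32_t perDrawStride)     // multiple of minUniformBufferOffsetAlignment
{
    for (uint32_t i = 0; i < drawCount; ++i)
    {
        // Same set, new offset: cheap. A different buffer binding: not cheap.
        // Keeping the per-draw block small with hot fields adjacent also
        // plays nicely with the 32/64-byte cache lines from Pro tip 4,
        // and remember (tip 3) this is just a read from main RAM on mobile.
        uint32_t dynamicOffset = i * perDrawStride;
        vkCmdBindDescriptorSets(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, layout,
                                0 /*firstSet*/, 1, &perDrawSet,
                                1, &dynamicOffset);

        vkCmdDraw(cmd, 3 /*placeholder vertex count*/, 1, 0, 0);
    }
}
```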
It was really nice to hear first-hand from the makers of one of the world’s biggest game engines and understand just why it was so important for them to be able to provide direct Vulkan access for their global base of developers.
So with this wealth of expertise at my fingertips my talk-time is over for another GDC and I’m already looking forward to next year! All is not yet over though, as I’m heading over to the show floor to bring you the best of the demos, games and cool stuff from us and all our partners.