timhar01 and I have just finished presenting a workshop about GPU Compute on mobile devices at the ARM Techcon Developer Summit in Santa Clara, California.
The three-hour workshop was a rapid tour of the current GPU Compute landscape, with a focus on mobile and ARM® Mali™ GPUs in particular.
Although fairly mature on the desktop, GPU Compute is relatively new to the mobile space. The Mali-T600 series was the first family of GPUs from ARM to support GPU Compute. The presentation started with an overview of the current landscape of GPU Compute on mobile, some of the use cases and the available APIs.
The presentation then focused on the details of two of the APIs that the Mali-T600 series supports: OpenCL™ and RenderScript. We looked at how they work and how to use them. These APIs are hardware abstraction layers, so they are generic across hardware; however, there are some implementation-specific parts, especially when it comes to optimisation. Because of this, I first presented the generic version and then moved on to how the APIs map to the underlying Mali hardware.
After we had established the basics of OpenCL and the Mali hardware, Tim took over to get into the details of optimisation. He presented some top tips for writing high-performance OpenCL code for Mali GPUs. To sum up these techniques and to walk through an OpenCL optimisation process, he then went through a small case study. The case study started with a naïve version of the Laplace image filter and went through various iterations, applying some of the tips and techniques presented earlier while looking at the performance numbers.
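For readers who have not come across a Laplace filter before, a naïve starting point might look roughly like the sketch below. This is not the code from the workshop: it is plain C rather than OpenCL, and the 4-neighbour kernel, the 8-bit greyscale input, the zeroed border and the name `laplace_naive` are all assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Naive Laplace filter over an 8-bit greyscale image (illustrative only).
 * Assumed kernel (standard 4-neighbour discrete Laplacian):
 *      0  1  0
 *      1 -4  1
 *      0  1  0
 * Results are written as signed ints because the response can be negative.
 * Border pixels have no full neighbourhood, so they are simply set to 0.
 */
void laplace_naive(const uint8_t *src, int *dst, int width, int height)
{
    assert(src != NULL && dst != NULL);
    assert(width >= 3 && height >= 3);

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            size_t i = (size_t)y * (size_t)width + (size_t)x;

            /* Skip the one-pixel border. */
            if (x == 0 || y == 0 || x == width - 1 || y == height - 1) {
                dst[i] = 0;
                continue;
            }

            /* One output pixel: five scalar loads and a handful of adds. */
            int up     = src[i - (size_t)width];
            int down   = src[i + (size_t)width];
            int left   = src[i - 1];
            int right  = src[i + 1];
            int centre = src[i];

            dst[i] = up + down + left + right - 4 * centre;
        }
    }
}
```

In an OpenCL port, the body of the inner loop would typically become the kernel, with one work-item per output pixel. A scalar, one-pixel-at-a-time version like this is the kind of baseline that optimisation tips are then applied to.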