I was watching the Mali GPU training video 2-2 Best practices principles. At around 3:50 there is a cycles per pixel calculation presented.
It shows (ShaderCores * Frequency) / (TargetFPS * Pixels).
So my question is: why is there no warp size in this? Shouldn't it be more like (ShaderCores * WarpSize * Frequency) / (TargetFPS * Pixels)?
Of course the "derated" factor would then be smaller to counter "useless" (not contributing to final image) helper lanes. Could it be that this is just a theoretical difference for actual cycles per pixel and the warp size doesn't make that big of a difference in real world use? Or is there something I got wrong in my base assumption to include the warp size in the calculation?
Loving the Mali GPU training videos so far! :)
Hi Laurin, glad you are enjoying the training.
The budgeting method presented in the course is a very simple one which aims to get to an initial "shader core cycles available per pixel" estimate. How much work can actually be completed per cycle depends on which Mali GPU you are running; different Mali designs can have quite different per-core performance. Per-core metrics can be found on the data sheet linked off here:
At this level of detail warp size doesn't matter - you can just scale the shader core cycle cost by, e.g., the total arithmetic capacity of the shader core to get a budget for "arithmetic per pixel". In reality each instruction would take longer to execute (N times longer, but as N threads are running in parallel in the warp, average throughput is unaffected).
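To make the point concrete, here is a small sketch of the budget calculation with made-up numbers (core count, clock, and per-core FMA capacity are illustrative assumptions, not figures from any real Mali data sheet). It shows the base cycles-per-pixel estimate, then scales it by an assumed arithmetic capacity, and finally shows why warp size cancels out of the throughput view:

```python
# Hypothetical numbers for illustration only - not from a real data sheet.
shader_cores = 4            # assumed number of shader cores
frequency_hz = 600e6        # assumed GPU clock in Hz
target_fps = 60
pixels = 2560 * 1440        # render resolution (QHD)

# Core-cycles the GPU can spend per pixel per frame:
# (core-cycles available per second) / (pixels required per second)
cycles_per_pixel = (shader_cores * frequency_hz) / (target_fps * pixels)

# Scale by an assumed per-core arithmetic capacity to get an
# "arithmetic per pixel" budget, as described above.
fma_per_core_per_clock = 32  # assumed FP32 FMA capacity per core per cycle
fma_per_pixel = cycles_per_pixel / shader_cores * \
    (shader_cores * fma_per_core_per_clock)

# Warp size cancels: a warp issues work for N threads at once, so each
# instruction occupies the pipeline N threads' worth, but N threads also
# retire together - average throughput per pixel is unchanged.
warp_size = 16
lane_view = (shader_cores * warp_size * frequency_hz) / (target_fps * pixels)
assert abs(lane_view / warp_size - cycles_per_pixel) < 1e-9

print(f"cycles/pixel ~= {cycles_per_pixel:.2f}")
```

With these assumed numbers the budget works out to roughly 10-11 core-cycles per pixel; the warp-size term appears and divides back out, which is why it never shows up in the video's formula.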
Laurin Agostini said:
> Of course the "derated" factor would then be smaller to counter "useless" (not contributing to final image) helper lanes.
Helper lanes are definitely one of the reasons for derating here, but not the only one.