I've been profiling a 3D scene on the Samsung Galaxy S7 and I've noticed that the CPU time spent in glDrawElements and glDrawArrays is a lot higher than on devices with Adreno and PowerVR GPUs.
For some context: in an effort to improve performance on Mali devices, I moved all the OpenGL calls to a separate render thread. After that change, the render thread is now the bottleneck for the entire application at a ~50-60 ms frame time in a scene with 335 draw calls (after letting the device sit for 5 minutes so it thermally throttles).
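For reference, the render-thread setup is roughly the pattern below (a simplified sketch; the class and names are illustrative rather than our actual code):

```cpp
// Simplified sketch of the render-thread pattern described above: the game
// thread enqueues GL work as std::function commands, and a dedicated thread
// that owns the EGL context executes them. Shutdown handling is omitted.
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

class RenderThread {
public:
    RenderThread() : worker_([this] { Run(); }) {}

    // Called from the game thread; the GL call runs later on the render thread.
    void Enqueue(std::function<void()> glWork) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            commands_.push(std::move(glWork));
        }
        cv_.notify_one();
    }

private:
    void Run() {
        // eglMakeCurrent(...) would happen here so every GL call stays on this thread.
        for (;;) {
            std::unique_lock<std::mutex> lock(mutex_);
            cv_.wait(lock, [this] { return !commands_.empty(); });
            std::function<void()> work = std::move(commands_.front());
            commands_.pop();
            lock.unlock();
            work();  // execute the queued GL call
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> commands_;
    std::thread worker_;  // initialized last so the queue and mutex exist first
};
```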
While I would normally put this down to being GPU-bound, I ran a DS-5 capture on the device and noticed that the GPU's vertex and fragment time is considerably lower than that (around ~30 ms once the device throttles).
Is there any explanation for why the GL calls take so long while the GPU isn't at 100% utilization? For some reason, every GL call seems to be more expensive on Mali.
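For what it's worth, this is roughly how I'm measuring the per-call CPU cost on the render thread (a minimal sketch; the Mesh struct and DrawScene function are simplified stand-ins for our real renderer):

```cpp
// Minimal sketch of how the per-call CPU cost is measured on the render thread.
// Only the time spent inside glDrawElements itself is accumulated, so this
// captures driver-side CPU overhead rather than GPU execution time.
#include <GLES3/gl3.h>
#include <chrono>
#include <cstdio>
#include <vector>

struct Mesh {
    GLuint vao;
    GLsizei indexCount;
};

void DrawScene(const std::vector<Mesh>& meshes) {
    using Clock = std::chrono::steady_clock;
    Clock::duration drawTime = Clock::duration::zero();

    for (const Mesh& mesh : meshes) {
        glBindVertexArray(mesh.vao);

        Clock::time_point start = Clock::now();
        glDrawElements(GL_TRIANGLES, mesh.indexCount, GL_UNSIGNED_SHORT, nullptr);
        drawTime += Clock::now() - start;  // CPU time spent inside the driver call
    }

    std::printf("CPU time inside glDrawElements: %.2f ms over %zu draws\n",
                std::chrono::duration<double, std::milli>(drawTime).count(),
                meshes.size());
}
```

Since glDrawElements returns before the GPU has executed the work, this should only be counting the driver's CPU-side submission cost, which matches what the DS-5 capture shows.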
I've attached a picture of our DS-5 capture, with the render thread isolated in the CPU Activity chart.
In addition, Unreal Engine's mobile optimization guidelines recommend keeping scenes at or below 700 draw calls. While I'm not using Unreal Engine, is that nevertheless a realistic target for this GPU?
Hitting 60 ms per frame with only 335 draw calls is much worse than we would expect; we see plenty of applications with 500+ draw calls hitting 60 FPS, so you're at roughly 1/5th of the draw-call throughput we would consider "normal".
Can you check what your CPU frequency is, and which CPU type you are running on? If I had to hazard a guess, it sounds like your application is locked to a "LITTLE" CPU running at a relatively low frequency, rather than migrating across to the big cores, which have higher software performance.
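If you want to check this from inside the application, something along these lines will log which core the render thread is currently on and that core's frequency. It's only a rough sketch using sched_getcpu() and the cpufreq sysfs nodes; access to those nodes may be restricted on newer Android releases.

```cpp
// Rough sketch: log which core the calling thread is currently running on and
// that core's current frequency. Linux/Android-specific; call this from the
// render thread. The big/LITTLE core numbering varies per SoC, so check your
// device's CPU topology when interpreting the core index.
#include <sched.h>
#include <cstdio>

void LogRenderThreadCore() {
    int cpu = sched_getcpu();  // index of the core this thread is running on
    if (cpu < 0) {
        std::printf("sched_getcpu() failed\n");
        return;
    }

    char path[128];
    std::snprintf(path, sizeof(path),
                  "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_cur_freq", cpu);

    long freqKHz = -1;
    if (FILE* f = std::fopen(path, "r")) {
        std::fscanf(f, "%ld", &freqKHz);
        std::fclose(f);
    }

    std::printf("render thread on cpu%d at %ld kHz\n", cpu, freqKHz);
}
```

Alternatively, `adb shell cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq` while the application is running, or the per-cluster CPU Activity charts in DS-5/Streamline, will show you the same information without modifying the application.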