Draw call performance on the Mali-T880 MP12

I've been profiling a 3D scene on the Samsung Galaxy S7 and I've noticed that the CPU time spent in glDrawElements and glDrawArrays is much higher than on Adreno and PowerVR GPUs.

For some context, in an effort to improve performance on Mali devices, I moved all the OpenGL calls to a separate render thread. After that change, the render thread is now the bottleneck for the entire application, at a frame time of 50-60ms in a scene with 335 draw calls (measured after letting the device sit for 5 minutes so that it thermal throttles).

While I would normally put this down to being GPU-bound, I ran a DS-5 capture on the device and noticed that the GPU's vertex and fragment time was well below this (around 30ms when the device throttles).

Is there any explanation for why the GL calls are taking so long while the GPU isn't at 100% utilization? It looks like every GL call is more expensive on Mali, for some reason.

Attached is a picture of our DS-5 capture, with the render thread isolated in CPU Activity.

In addition, the Unreal Engine mobile optimization guidelines recommend keeping scenes to <= 700 draw calls. While I'm not using the Unreal Engine, is this nevertheless a realistic target for this GPU?
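
As a rough back-of-the-envelope check on the figures quoted above, the sketch below turns the throttled frame time and the draw call count into an average CPU cost per draw call, then projects that cost onto a 700 draw call scene. It assumes the whole render-thread frame is spent submitting draws, so the per-draw figure is an upper bound rather than a measurement, and it assumes the cost scales linearly with draw count. The function names are made up for illustration.

```python
# Figures quoted in the post above.
FRAME_TIME_MS = (50.0, 60.0)  # throttled render-thread frame time, low and high
GPU_TIME_MS = 30.0            # approximate vertex + fragment time from DS-5
DRAW_CALLS = 335


def cpu_cost_per_draw_us(frame_ms: float, draw_calls: int) -> float:
    """Upper bound on the average CPU cost per draw call, in microseconds.

    Assumption: the entire render-thread frame is spent issuing draws.
    """
    return frame_ms * 1000.0 / draw_calls


def projected_frame_ms(cost_us: float, draw_calls: int) -> float:
    """Render-thread frame time if the per-draw cost stayed constant."""
    return cost_us * draw_calls / 1000.0


for frame_ms in FRAME_TIME_MS:
    cost = cpu_cost_per_draw_us(frame_ms, DRAW_CALLS)
    print(
        f"{frame_ms:.0f} ms frame: at most {cost:.0f} us per draw call, "
        f"{frame_ms - GPU_TIME_MS:.0f} ms longer than the GPU time, "
        f"about {projected_frame_ms(cost, 700):.0f} ms projected at 700 draw calls"
    )
```

Under those assumptions the per-draw cost, not the GPU, would decide whether 700 draw calls is reachable, which is why the CPU-side submission time is the number worth tracking here.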

  • Hi

    Can you provide a systrace captured when the device is throttled down? I had a look at the traces you sent and I noticed a bit of serialization between the main thread and the render thread. Specifically, I see the render thread waiting for the main thread before it starts. Are the game and render threads sharing something that needs synchronization? When the device throttles down, I would expect the main thread to take longer to complete as well, and the sync point to make overall performance suffer.

    If possible, it would be good to have a test APK so we can understand whether there is a specific reason why you are seeing this high CPU usage.

    -DDD
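
To illustrate why the sync point described in the reply matters, here is a minimal timing model, not a measurement. It compares a render thread that waits for the main thread every frame with one that overlaps with it. The function name, the millisecond values, and the throttle factor are all hypothetical, and the model ignores synchronization overhead and GPU time.

```python
def frame_time_ms(main_ms: float, render_ms: float, pipelined: bool) -> float:
    """Steady-state frame time for a game thread plus render thread split.

    Serialized: the render thread waits for the main thread each frame,
    so the two costs add up.
    Pipelined: the render thread submits frame N while the main thread
    prepares frame N+1, so the slower of the two sets the pace.
    """
    return max(main_ms, render_ms) if pipelined else main_ms + render_ms


# Hypothetical per-frame costs, before and after thermal throttling.
main_ms, render_ms = 12.0, 25.0
throttle = 1.5  # assumed slowdown applied to both threads

for label, scale in (("cool", 1.0), ("throttled", throttle)):
    m, r = main_ms * scale, render_ms * scale
    print(
        f"{label}: serialized {frame_time_ms(m, r, False):.1f} ms, "
        f"pipelined {frame_time_ms(m, r, True):.1f} ms"
    )
```

In the serialized case any slowdown of the main thread is added directly to the frame time, which matches the expectation in the reply that throttling makes the sync point more costly.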
