Note 5 - still struggling for performance (GLES2.0, Java app)

For the best part of a year, on and off, I've been trying without success to get some semblance of decent performance out of some of our Android test devices.

The Note 5 has been particularly reluctant to give up the goods. I've multithreaded our engine, such that one thread does nothing but translate pre-compiled render packets into GL calls. The other thread updates the game and generates the packets, and completes well within the desired 16ms deadline.

On an iPod6 (single-threaded), framerate is nailed to 60, and the GPU time is measured at 8ms.

On the Note 5, the exact same sequence of GL calls exceeds 16ms, and fluctuates wildly. Attempting to profile it gives results like this:

Each green chunk is one frame from a static, unvarying scene. Notice how it can sometimes take two or three times as long to dispatch the exact same GL calls. Meanwhile, a hardware monitor tells me the GPU is barely ticking over, at base clock speed and 50% load or lower, and the CPU also rarely clocks up anywhere near its maximum.

It's almost as if the phone isn't really trying, but I can't find any clue as to what's actually going on. Help!

  • I can't explain why the CPU frequency isn't ramping up under load. Normally, if the device is busy, the frequency should increase unless the device has hit some physical limit such as a temperature threshold. This isn't under our control: all of the CPU and GPU frequency control is provided by Samsung in this case.

    In terms of the high baseline CPU processing cost, how many draw calls are you making per frame? Draw calls can be expensive on Mali, especially on older devices running older drivers, so we generally recommend keeping the total under 500 draw calls per frame. Without knowing your application in more detail it's hard to provide specific advice. Many things can cause high CPU load, such as bulk data uploads, or resource copies made when the driver creates ghosts to avoid pipeline drains. Resource ghosting in particular is sensitive to the relative pipeline latency of the frames being built on the CPU and the frames completing on the GPU, so that could explain why some frames are slower than others (e.g. some frames trigger a ghost being built, some don't).

    Is there any way you can share a Mali Graphics Debugger capture of a typical frame API sequence, or a reproducer for your application?
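
As a quick way to answer the draw call question before capturing a trace, a small counter along these lines could be dropped into the render thread. This is only a sketch: the class and method names are hypothetical, it assumes you call `recordDraw()` next to each `glDrawArrays` / `glDrawElements` and `endFrame()` once per frame (for example just before the buffer swap), and the budget of 500 is simply the guideline figure mentioned above.

```java
/**
 * Hypothetical helper: counts draw calls per frame and tracks how often
 * a per-frame budget is exceeded. Intended for use on the render thread only.
 */
final class DrawCallCounter {
    private final int budget;        // e.g. 500, the guideline mentioned above
    private int drawsThisFrame;      // draws recorded since the last endFrame()
    private int maxDraws;            // highest per-frame count seen so far
    private long frames;             // number of completed frames
    private long framesOverBudget;   // frames whose count exceeded the budget

    DrawCallCounter(int budget) {
        this.budget = budget;
    }

    /** Call once for every draw call the render thread issues. */
    void recordDraw() {
        drawsThisFrame++;
    }

    /** Call once per frame; returns that frame's draw count and resets it. */
    int endFrame() {
        int draws = drawsThisFrame;
        drawsThisFrame = 0;
        frames++;
        if (draws > budget) {
            framesOverBudget++;
        }
        if (draws > maxDraws) {
            maxDraws = draws;
        }
        return draws;
    }

    int maxDraws() { return maxDraws; }

    long frames() { return frames; }

    long framesOverBudget() { return framesOverBudget; }
}
```

Logging `maxDraws()` and `framesOverBudget()` every few seconds, rather than every frame, keeps the measurement itself from adding noticeable CPU load.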
