Shader data path utilization counters

Hi, I'm using Streamline to profile our game's GPU performance. My device is a Mali-G77 MC9, so I followed this guide: mali-g77-counters.
In the "Shader data path utilization" section, I found that the "Fragment FPK buffer active percentage" expression gives a high value:

Fragment FPK buffer active percentage: 92%

This expression defines the percentage of cycles where the forward pixel kill (FPK) quad buffer, before the execution core, contains at least one quad.
I assume we should keep this percentage as low as possible? A high value would seem to indicate that we spent a lot of shader core power computing useless quads. Is this correct?

  • min(($MaliCoreCyclesFragmentFPKBActive / $MaliCoreCyclesFragmentActive) * 100, 100)
  • min((24.3 / 26.3) * 100, 100) = 92%

Execution core utilization: 98%

  • min(($MaliCoreCyclesExecutionCoreActive / $MaliGPUCyclesGPUActive) * 100, 100)
  • min((29.6/30.7)*100,100)=98%
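Both expressions above have the same shape: a ratio of two cycle counters, scaled to a percentage and clamped at 100. A minimal Python sketch of that arithmetic follows. The function name `clamped_percentage` is hypothetical, and the inputs are the rounded values quoted in this post, not raw counter data.

```python
def clamped_percentage(active_cycles: float, total_cycles: float) -> float:
    """Return active/total as a percentage, clamped to at most 100.

    Mirrors the shape of min((a / b) * 100, 100) used in the expressions above.
    """
    if total_cycles <= 0:
        # No cycles recorded in the sample window; report 0 rather than divide by zero.
        return 0.0
    return min((active_cycles / total_cycles) * 100.0, 100.0)


# Rounded values quoted in the post (assumed to share the same unit).
fpk_buffer_active = clamped_percentage(24.3, 26.3)
execution_core = clamped_percentage(29.6, 30.7)

print(f"Fragment FPK buffer active: {fpk_buffer_active:.1f}%")
print(f"Execution core utilization: {execution_core:.1f}%")
```

With these rounded inputs the first expression evaluates to about 92.4%, matching the reported 92%. The second evaluates to about 96.4% rather than 98%, so the reported 98% presumably comes from unrounded counter values or a different sample window.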

Execution core utilization looks good, because we did have a lot of shader instructions to run.

But I don't understand why the Fragment FPK buffer active percentage is so high (92%).

Does this mean our content spends too much time in the FPK buffer stage, because FPK kills a lot of quads?

And the geometry culling expressions did show that a lot of geometry is killed by the hardware culling unit.

Total input primitives: 923K

Total culled primitives: 620K

Visible primitives after culling: 33%

Facing or XY plane test cull rate: 42%

Z plane test cull rate: 16%

Sample test cull rate: 32%
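The three cull rates do not simply add up to the total. A minimal Python sketch of how they appear to compose follows, assuming each stage's rate is measured against the primitives that survived the previous stage. That assumption is consistent with the figures above and with the reply below, which reads the 32% as a share of front-facing in-frustum primitives. The function name `visible_fraction` is hypothetical.

```python
def visible_fraction(stage_cull_rates):
    """Fraction of input primitives left after applying each culling stage in order.

    Assumes every rate is relative to the primitives surviving the previous stage.
    """
    surviving = 1.0
    for rate in stage_cull_rates:
        surviving *= 1.0 - rate
    return surviving


# Facing or XY plane test, Z plane test, sample test (values from the post).
rates = [0.42, 0.16, 0.32]
visible = visible_fraction(rates)

total_input = 923_000  # "923K" as quoted, so only approximate
print(f"Visible after culling: {visible * 100:.0f}%")
print(f"Estimated culled primitives: {total_input * (1.0 - visible):,.0f}")
```

Under that assumption the stages leave about 33% of primitives visible and cull roughly 617K of the 923K inputs, which agrees with the reported 33% and 620K to within rounding.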

Does this mean I should do something to lower the geometry culling rate? Is this related to the "Fragment FPK buffer active percentage"?

What should I do to lower the "Fragment FPK buffer active percentage"?

  • I assume we should keep this percentage as low as possible?

    No, you want it as high as possible (keep the buffer filled, so there is forward pressure on the shader core and more scope for hidden surface removal). 

    And the geometry culling expressions did show that a lot of geometry is killed by the hardware culling unit.

    You would expect roughly 50% to be visible (the other half is in the frustum, but back-facing, so it is killed by back-face culling).

    The Z plane test cull rate indicates that 16% of primitives are on the wrong side of the near/far clip plane. Check your CPU-side culling to make sure you are killing as much as you can there.

    The sample test cull rate (32%) indicates that you have a lot of very dense meshes. 32% of your front-facing in-frustum primitives are being killed because they are so small they hit no sample points. Simplify your meshes and use dynamic level-of-detail based on camera distance. Aim to keep triangles above ~10-15 pixels in size.

    HTH, 

    Pete

  • Thank you for your detailed answer.
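The level-of-detail advice in the reply above can be sketched as a simple selection rule. The Python below is a rough illustration under stated assumptions, not an engine API: all function and parameter names are hypothetical, the mesh is approximated by a bounding sphere, about half of its triangles are assumed to be front-facing, and the "~10-15 pixels" target is read as an average on-screen area per triangle.

```python
import math


def projected_radius_px(radius, distance, fov_y_deg, screen_height_px):
    """Approximate on-screen radius, in pixels, of a bounding sphere."""
    focal_px = screen_height_px / (2.0 * math.tan(math.radians(fov_y_deg) / 2.0))
    return radius / distance * focal_px


def select_lod(lod_triangle_counts, radius, distance,
               fov_y_deg=60.0, screen_height_px=1080,
               min_px_per_triangle=10.0):
    """Pick the most detailed LOD whose average triangle covers enough pixels.

    lod_triangle_counts is ordered from most to least detailed.
    Falls back to the coarsest LOD if none meets the target.
    """
    if distance <= 0.0:
        return 0  # camera inside or at the object: use full detail
    r_px = projected_radius_px(radius, distance, fov_y_deg, screen_height_px)
    area_px = math.pi * r_px * r_px
    for index, triangle_count in enumerate(lod_triangle_counts):
        # Assumption: roughly half the triangles face the camera.
        front_facing = max(triangle_count / 2.0, 1.0)
        if area_px / front_facing >= min_px_per_triangle:
            return index
    return len(lod_triangle_counts) - 1


# Hypothetical mesh with three LODs and a bounding radius of 1 world unit.
lods = [20_000, 5_000, 1_000]
for distance in (5.0, 10.0, 100.0):
    print(distance, select_lod(lods, radius=1.0, distance=distance))
```

In a real engine the thresholds would be tuned against the sample test cull rate measured in Streamline rather than derived from this estimate alone.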