Hello, I have recently been studying Mali GPU counters and have some questions about the "Primitives loaded" and "Primitives dropped" counters. I am referring to the counter explanation here: https://community.arm.com/docs/DOC-10182#jive_content_id_321_COMPUTE_TASKS
It says that the "Primitives loaded" counter increments for every primitive read from the tile list, but that not all of these triangles will necessarily be visible in the current tile, due to the use of the hierarchical tiler. What does the counter actually measure: the number of triangles in one frame, the number of triangles in a single tile, or something else? Could you explain in detail?
Hi seufanghao,
That document explains what the hardware counters mean, and is technically generic: it is independent of any tool, such as ARM DS-5 Streamline, which uses those counters and presents the information to the user.
Our hardware counters, such as the "primitives" ones, are counted on a per-tile basis. Streamline aggregates all of these across the render targets, and then aggregates those again over time.
This is because Streamline is a time-based profiler, not a frame-, render-target-, or tile-based profiler.
The number you see is the number of 'primitives' for that time slice selected, across all tiles/cores/render targets/frames that occurred during that time slice.
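To illustrate the idea of time-based aggregation, here is a minimal sketch. It is not how Streamline is implemented; the `Sample` type, the `primitives_in_slice` helper, and all of the sample values are hypothetical and exist only to show that the reported number is a sum over everything that fell inside the selected time slice.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    """One hypothetical counter sample: when it was taken and its value."""
    timestamp_ms: float
    primitives_loaded: int


def primitives_in_slice(samples, start_ms, end_ms):
    """Sum the counter over all samples inside the half-open window [start_ms, end_ms)."""
    return sum(
        s.primitives_loaded
        for s in samples
        if start_ms <= s.timestamp_ms < end_ms
    )


# Made-up samples; each value already includes every tile, core and
# render target that was processed during that sample period.
samples = [
    Sample(0.0, 1200),
    Sample(8.0, 2400),
    Sample(16.0, 3600),
    Sample(24.0, 3600),
]

print(primitives_in_slice(samples, 0.0, 20.0))  # 7200
```

Selecting a wider or narrower slice changes the total, which is why the value does not map directly onto a single frame or a single tile.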
I hope that helps explain things a little better.
Kind Regards,
Michael McGeagh
Hi mcgeagh,
Thanks for your explanation. I ran a test which draws 10 triangles on the screen. The "Tiler triangles" counter reads 10, exactly matching the number of triangles drawn (the primitives here are triangles). However, the "Primitives loaded" counter is not 10; it is ~3000 (at a resolution of 1280x720). I have no idea why the number of primitives is not 10. What is the difference between "Tiler triangles" and "Primitives loaded"? And how is the number of primitives loaded calculated?
Mali is a tile-based renderer (see The Mali GPU: An Abstract Machine, Part 2 - Tile-based Rendering) which means that we break up the fragment rendering into a series of 16x16 pixel tiles.
We have to read the primitive data for every tile, so if you had one large triangle which covered a 720p screen you would get one triangle seen by the tiler (this is the application triangle count), and 3600 triangles read during fragment shading (one triangle, read once per tile; a 1280x720 target is 80 x 45 = 3600 tiles of 16x16 pixels).
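The tile arithmetic for that example can be checked with a short sketch. The `tiles_covered` helper is a hypothetical name, and the sketch assumes the primitive touches every tile of the render target, so it gives an upper bound for a single full-screen triangle rather than a way to predict the counter for arbitrary geometry.

```python
import math


def tiles_covered(width_px, height_px, tile_px=16):
    """Number of tile_px x tile_px tiles needed to cover a render target."""
    return math.ceil(width_px / tile_px) * math.ceil(height_px / tile_px)


tiler_triangles = 1  # one application triangle covering the whole screen
primitives_loaded = tiler_triangles * tiles_covered(1280, 720)

print(tiles_covered(1280, 720))  # 3600
print(primitives_loaded)         # 3600
```

Smaller triangles touch fewer tiles each, so the loaded count depends on how the geometry overlaps the tile grid, not just on how many triangles were submitted.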
The exact count doesn't matter too much - don't try to compute it - but do watch out for the ratio of the number of primitives loaded versus the number of fragment threads created. This gives some means to compute an average triangle size, from the point of view of the fragment shading operation. In general you want to keep the number of fragments per loaded primitive relatively high (rule of thumb - higher than 10) - triangles are expensive so keep them relatively large.
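As a rough sketch of that ratio check: the function name and the counter values below are hypothetical, chosen to match the full-screen triangle example (1280x720 fragments, 3600 primitives loaded), and the threshold of 10 is only the rule of thumb mentioned above.

```python
def fragments_per_primitive(fragment_threads, primitives_loaded):
    """Average number of fragment threads per primitive loaded, or None if nothing was loaded."""
    if primitives_loaded == 0:
        return None
    return fragment_threads / primitives_loaded


# Hypothetical counter readings for one full-screen triangle at 720p.
fragment_threads = 1280 * 720
primitives_loaded = 3600

ratio = fragments_per_primitive(fragment_threads, primitives_loaded)
print(ratio)       # 256.0
print(ratio > 10)  # True
```

A ratio of 256 here simply reflects one triangle filling every 16x16 tile; content made of many tiny triangles would push the ratio down towards, or below, the rule-of-thumb value.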
HTH, Pete