Hi Shawn, Older Mali GPUs (Utgard, Midgard) render vertices in incrementing index order, between min and max referenced index, ignoring the actual order in the index buffer.
Newer Mali GPUs (Bifrost) render vertices in the order in the index buffer (in contiguous blocks of 4), and can skip ranges that are not referenced (except those that are in the same block of 4 as a referenced vertex).
Using a mesh conditioning tool such as meshoptimizer is a very good idea; the algorithms it applies work well for both types of Mali GPU, whether the goal is to minimize bandwidth or to reduce shading cost.
HTH, Pete
Peter Harris said: Older Mali GPUs (Utgard, Midgard) render vertices in incrementing index order, between min and max referenced index, ignoring the actual order in the index buffer.
Thank you very much for your reply, but it raises some new questions. For example, there are vertices 0, 1, 2, 3, 4, 5, 6, and 7 in the vertex buffer. In actual drawing, the index buffer only uses three vertices 0, 5, and 7. But the GPU side will also execute the calculation of the five vertices 1, 2, 3, 4, and 6?
And how about the order of fragments?
Mali is tile-based, so effectively fragment order is scrambled - it doesn't correspond to the draw sequence in any way (except within a pixel, where layering must respect blending, of course).
Shawn Chang said: In actual drawing, the index buffer only uses three vertices 0, 5, and 7. But the GPU side will also execute the calculation of the five vertices 1, 2, 3, 4, and 6?
Yes. Don't leave unreferenced vertices inside the active index range; aim to reference every vertex between min and max.
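To put rough numbers on this, here is a minimal sketch of the two behaviours described in this thread. It is only a counting model, not a description of the hardware: the function names are made up for illustration, the blocks of 4 are assumed to be aligned to multiples of 4, and caching and other optimizations are ignored.

```python
def shaded_range_model(indices):
    """Older-GPU model: every vertex from min to max index is shaded once."""
    return max(indices) - min(indices) + 1


def shaded_block_model(indices, block=4):
    """Newer-GPU model: only blocks containing a referenced vertex are shaded.

    Assumes blocks are aligned to multiples of `block` (an assumption made
    for this sketch).
    """
    touched_blocks = {i // block for i in indices}
    return len(touched_blocks) * block


# The example from the question: only vertices 0, 5 and 7 are referenced.
print(shaded_range_model([0, 5, 7]))   # 8
print(shaded_block_model([0, 5, 7]))   # 8

# A sparser index range, where whole blocks can be skipped.
print(shaded_range_model([0, 9, 15]))  # 16
print(shaded_block_model([0, 9, 15]))  # 12
```

In both models the cost depends on how spread out the referenced indices are, which is why a tightly packed index range helps on either generation.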
When implementing mesh level-of-detail, duplicate vertices so that each LOD has its own compacted index range; don't make low-resolution meshes by sparsely sampling from the LOD 0 mesh.
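As an illustration of the compaction step, the sketch below copies only the vertices a LOD references into a new, contiguous vertex list and rewrites the indices to match. `compact_lod` is a hypothetical helper written for this example, and the string "vertices" are placeholders for real vertex data; mesh tools such as meshoptimizer can do this kind of remapping for you.

```python
def compact_lod(vertices, indices):
    """Return (new_vertices, new_indices) containing only referenced vertices.

    Vertices are emitted in order of first use, so the new index range is
    dense: every vertex between min and max is referenced.
    """
    remap = {}          # old index -> new index
    new_vertices = []
    new_indices = []
    for old in indices:
        if old not in remap:
            remap[old] = len(new_vertices)
            new_vertices.append(vertices[old])
        new_indices.append(remap[old])
    return new_vertices, new_indices


lod0_vertices = ["v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7"]
sparse_lod1_indices = [0, 5, 7]  # sparsely samples the LOD 0 vertices

lod1_vertices, lod1_indices = compact_lod(lod0_vertices, sparse_lod1_indices)
print(lod1_vertices)  # ['v0', 'v5', 'v7']
print(lod1_indices)   # [0, 1, 2]
```

The cost is some duplicated vertex data, but the LOD 1 draw then covers a range of 3 vertices rather than 8.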
In relatively new GPUs, calculations are done in the order of the index buffer. The vertices shared by multiple triangles may be calculated multiple times. What about these old GPUs you mentioned? Will there be multiple calculations? For example, when drawing two triangles with indices 0,1,2 and 2,1,3, will vertices 1 and 2 each be shaded twice?
In other words, older Mali GPUs (Utgard, Midgard) simply render all vertices once in incrementing index order, between min and max referenced index?
In relatively new GPUs, calculations are done in the order of the index buffer. The vertices shared by multiple triangles may be calculated multiple times.
Yes, they may be shaded multiple times, but there is post-transform caching, so this only happens if your index buffers have poor locality. In general we don't see much reshading for sensibly ordered meshes.
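To see why locality matters, here is a toy model of a post-transform cache. The FIFO policy, the cache sizes, and the function name are all assumptions chosen for illustration; the real Mali caching behaviour is not described in this thread and will differ.

```python
from collections import deque


def count_shader_invocations(indices, cache_size=16):
    """Count vertex shader runs with a toy FIFO post-transform cache.

    A vertex is shaded on a cache miss and then stored; the oldest entry is
    evicted when the cache is full. Both the policy and the default size are
    hypothetical.
    """
    cache = deque(maxlen=cache_size)
    invocations = 0
    for i in indices:
        if i not in cache:
            invocations += 1
            cache.append(i)
    return invocations


# Two triangles sharing an edge: the shared vertices 1 and 2 are reused.
print(count_shader_invocations([0, 1, 2, 2, 1, 3]))  # 4

# Poor locality: unrelated vertices push vertex 1 out of a small cache
# before it is reused, so it is shaded a second time (7 unique vertices).
scattered = [0, 1, 2, 100, 101, 102, 2, 1, 3]
print(count_shader_invocations(scattered, cache_size=4))  # 8
```

Index buffers that keep reuse close together, which is what mesh optimization tools aim for, stay near one invocation per unique vertex in this model.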
What about these old GPUs you mentioned? Will there be multiple calculations? In other words, older Mali GPUs (Utgard, Midgard) simply render all vertices once in incrementing index order, between min and max referenced index?
It's guaranteed not to reshade, but it may shade non-referenced vertices between min and max (various optimizations exist to minimize this, but YMMV). It's therefore recommended to tightly pack used indices, with no unreferenced vertices between min and max.