
# Processing order of vertices and fragments in Arm GPUs

When an Arm GPU processes vertices and fragments, in what order are they distributed and processed? Are vertices processed in index order? What about fragments? Can we control the index order on the CPU side to improve GPU efficiency?
For example, do the optimizations in the project linked below help on Arm GPUs? If so, which GPU features do they take advantage of? github.com/.../meshoptimizer
• Hi Shawn,

Older Mali GPUs (Utgard, Midgard) render vertices in incrementing index order, between min and max referenced index, ignoring the actual order in the index buffer.

Newer Mali GPUs (Bifrost) render vertices in the order in the index buffer (in contiguous blocks of 4), and can skip ranges that are not referenced (except those that are in the same block of 4 as a referenced vertex).
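
To make the difference concrete, here is a rough sketch that counts how many vertices each model would shade for a given index buffer. The function names are made up for illustration, and the block model assumes blocks are aligned to multiples of 4 and that each block is shaded only once, which is a simplification rather than a description of the actual hardware.

```python
def shaded_range_model(indices):
    """Older model: every vertex from the min to the max referenced index is shaded once."""
    return max(indices) - min(indices) + 1


def shaded_block_model(indices, block=4):
    """Newer model: only blocks of `block` vertices that contain a referenced index are shaded.

    Assumption: blocks are aligned to multiples of `block` and each block
    is counted once, regardless of how often it is referenced.
    """
    touched = {i // block for i in indices}
    return len(touched) * block


sparse = [0, 1, 2, 40, 41, 42]
print(shaded_range_model(sparse))  # 43
print(shaded_block_model(sparse))  # 8
```

Under these assumptions, a sparse index range is far more costly in the min/max model than in the block model.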

Using mesh conditioning tools such as meshoptimizer is a very good idea; the algorithms they apply work well for both types of Mali GPU, either minimizing bandwidth or reducing shading cost.

HTH,
Pete

• Older Mali GPUs (Utgard, Midgard) render vertices in incrementing index order, between min and max referenced index, ignoring the actual order in the index buffer.

Thank you very much for your reply, but it raises a new question. For example, suppose the vertex buffer contains vertices 0, 1, 2, 3, 4, 5, 6, and 7, but the index buffer for a draw only references three of them: 0, 5, and 7. Will the GPU still shade the five unreferenced vertices 1, 2, 3, 4, and 6?

• The index buffer for a draw only references three of them: 0, 5, and 7. Will the GPU still shade the five unreferenced vertices 1, 2, 3, 4, and 6?

Yes. Don't leave unused indices in the active index range; aim to reference every vertex between the min and max index.
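
One way to get there is to rebuild the vertex buffer so that it only contains referenced vertices and to remap the indices to match. The sketch below is a minimal illustration with a made-up function name (`compact_mesh`); it is not the API of meshoptimizer or any other tool.

```python
def compact_mesh(vertices, indices):
    """Return (new_vertices, new_indices) containing only referenced vertices.

    Vertices are renumbered in order of first use, so the new index range
    0..len(new_vertices)-1 has no gaps.
    """
    remap = {}          # old index -> new index
    new_vertices = []
    new_indices = []
    for old in indices:
        if old not in remap:
            remap[old] = len(new_vertices)
            new_vertices.append(vertices[old])
        new_indices.append(remap[old])
    return new_vertices, new_indices


vertices = ["v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7"]
new_vertices, new_indices = compact_mesh(vertices, [0, 5, 7])
print(new_vertices)  # ['v0', 'v5', 'v7']
print(new_indices)   # [0, 1, 2]
```

After compaction, the min-to-max range covers exactly the three vertices that are drawn.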

To implement mesh level-of-detail, duplicate vertices so that each LOD has its own compacted index range; don't build low-resolution meshes by sparsely sampling vertices from the LOD 0 mesh.
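
As a rough illustration of that layout, the sketch below packs several LODs into one vertex buffer, giving each LOD its own contiguous block of duplicated vertices. The name `pack_lods` and the data are hypothetical, and a real pipeline would do this offline in the asset build.

```python
def pack_lods(vertices, lod_indices):
    """Pack each LOD's referenced vertices into its own contiguous range.

    Returns (packed_vertices, packed_lods), where packed_lods[k] is the
    index list for LOD k, addressing packed_vertices directly.
    """
    packed_vertices = []
    packed_lods = []
    for indices in lod_indices:
        base = len(packed_vertices)   # first slot of this LOD's range
        remap = {}                    # old index -> packed index
        out = []
        for old in indices:
            if old not in remap:
                remap[old] = base + len(remap)
                packed_vertices.append(vertices[old])  # duplicated per LOD
            out.append(remap[old])
        packed_lods.append(out)
    return packed_vertices, packed_lods


vertices = list("abcdefgh")
lod0 = [0, 1, 2, 2, 1, 3, 4, 5, 6, 6, 5, 7]   # full-resolution mesh
lod1 = [0, 3, 7]                              # sparse sample of LOD 0
packed, lods = pack_lods(vertices, [lod0, lod1])
print(lods[1])  # [8, 9, 10]
```

Here LOD 1 references a dense range of three vertices instead of a min-to-max span of eight, at the cost of three duplicated vertices.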

• Older Mali GPUs (Utgard, Midgard) render vertices in incrementing index order, between min and max referenced index, ignoring the actual order in the index buffer.

On relatively new GPUs, vertices are shaded in index buffer order, so a vertex shared by multiple triangles may be shaded more than once. What about the older GPUs you mentioned? Can vertices be shaded multiple times there too?
For example, suppose two triangles are drawn with indices 0, 1, 2 and 2, 1, 3 in the index buffer. Will the vertices at indices 1 and 2 be shaded twice?

In other words, do older Mali GPUs (Utgard, Midgard) simply shade every vertex once, in incrementing index order, between the min and max referenced index?