Processing order of vertices and fragments on Arm GPUs

When an Arm GPU processes vertices and fragments, in what order are they distributed and processed? Are vertices processed in index order? What about fragments? Can we control the index order on the CPU side to improve GPU efficiency?
For example, do the optimizations in the link below help on Arm GPUs? If so, which GPU features do they take advantage of? github.com/.../meshoptimizer
  • Hi Shawn, 

    Older Mali GPUs (Utgard, Midgard) render vertices in incrementing index order, between min and max referenced index, ignoring the actual order in the index buffer. 

    Newer Mali GPUs (Bifrost) render vertices in the order in the index buffer (in contiguous blocks of 4), and can skip ranges that are not referenced (except those that are in the same block of 4 as a referenced vertex).  

    Using a mesh conditioning tool such as meshoptimizer is a very good idea; the algorithms applied will work well for both types of Mali GPU, either to minimize bandwidth or to reduce shading cost.

    HTH, 
    Pete
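To make the difference between the two behaviours concrete, here is a rough Python model of which vertices each scheme would shade for a given index buffer. It is only a sketch of the description above, not of real hardware: the function names are hypothetical, the blocks of 4 are assumed to be aligned to multiples of 4 in index space, and caching, reshading and vertex buffer bounds are ignored.

```python
def shaded_min_max(indices):
    """Model of the older behaviour: every vertex between the minimum and
    maximum referenced index is shaded once, referenced or not."""
    return set(range(min(indices), max(indices) + 1))


def shaded_blocks_of_4(indices, block=4):
    """Model of the newer behaviour: only blocks of 4 vertices that contain
    at least one referenced index are shaded.
    Assumption: blocks are aligned to multiples of `block`."""
    shaded = set()
    for i in indices:
        start = (i // block) * block
        shaded.update(range(start, start + block))
    return shaded


# Two small clusters of referenced vertices with a large unused gap between them.
sparse_indices = [0, 1, 2, 40, 41, 42]

print(len(shaded_min_max(sparse_indices)))         # vertices shaded, min/max model
print(sorted(shaded_blocks_of_4(sparse_indices)))  # vertices shaded, block model
```

In this model the min/max scheme pays for the whole gap between the two clusters, while the block scheme only pays for the unreferenced vertices that share a block with a referenced one.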

  • Older Mali GPUs (Utgard, Midgard) render vertices in incrementing index order, between min and max referenced index, ignoring the actual order in the index buffer. 

    Thank you very much for your reply, but it raises new questions. For example, suppose the vertex buffer contains vertices 0, 1, 2, 3, 4, 5, 6, and 7, and the index buffer in the actual draw only uses the three vertices 0, 5, and 7. Will the GPU still shade the five vertices 1, 2, 3, 4, and 6?

  • And how about the order of fragments?

  • Mali is tile-based, so effectively fragment order is scrambled - it doesn't correspond to the draw sequence in any way (except within a pixel, where layering must respect blending, of course).

    HTH, 
    Pete

  • The index buffer in the actual draw only uses the three vertices 0, 5, and 7. Will the GPU still shade the five vertices 1, 2, 3, 4, and 6?

    Yes. Don't leave unused indices in the active index range; aim to reference every vertex between min and max. 

    For implementing mesh level-of-detail, duplicate vertices and give each LOD its own compacted index range. Don't build low-resolution meshes by sparsely sampling from the LOD 0 mesh.
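As an illustration of the compacted-range advice, the following Python sketch copies only the vertices a given LOD references into a new, dense vertex list and rewrites the indices to match. The function name `compact_lod` and the string placeholders for vertex data are hypothetical; a real pipeline would do this offline on actual vertex attributes.

```python
def compact_lod(vertices, lod_indices):
    """Return (new_vertices, new_indices) for one LOD.

    Only vertices referenced by lod_indices are copied, in first-use order,
    so new_indices covers a dense 0..n-1 range with no unreferenced gaps.
    """
    remap = {}          # old vertex index -> new vertex index
    new_vertices = []
    new_indices = []
    for old in lod_indices:
        if old not in remap:
            remap[old] = len(new_vertices)
            new_vertices.append(vertices[old])  # duplicate the vertex data
        new_indices.append(remap[old])
    return new_vertices, new_indices


# LOD 0 vertex data (placeholders) and a low-detail LOD that samples it sparsely.
lod0_vertices = ["v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7"]
sparse_lod_indices = [0, 5, 7]

lod_vertices, lod_indices = compact_lod(lod0_vertices, sparse_lod_indices)
print(lod_vertices)  # vertex data duplicated for this LOD
print(lod_indices)   # dense indices into lod_vertices
```

The cost is some duplicated vertex data per LOD; the benefit is that the min/max index range of each LOD contains only vertices that are actually used.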

  • Older Mali GPUs (Utgard, Midgard) render vertices in incrementing index order, between min and max referenced index, ignoring the actual order in the index buffer. 

    On relatively new GPUs, vertices are shaded in index buffer order, so vertices shared by multiple triangles may be shaded multiple times. What about the older GPUs you mentioned? Can they also shade a vertex more than once?
    For example, when drawing two triangles with indices 0, 1, 2 and 2, 1, 3, will the vertices at indices 1 and 2 be shaded twice?

    In other words, do older Mali GPUs (Utgard, Midgard) simply shade every vertex once, in incrementing index order, between the min and max referenced index?

  • On relatively new GPUs, vertices are shaded in index buffer order, so vertices shared by multiple triangles may be shaded multiple times.

    Yes, vertices may be shaded multiple times, but there is post-transform caching, so reshading only happens if your index buffer has poor locality. In general we don't see much reshading for sensibly ordered meshes.
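The effect of index locality on reshading can be illustrated with a toy post-transform cache model in Python. The FIFO replacement policy and the cache size used here are assumptions chosen for illustration only; the actual cache organisation of Mali GPUs is not described in this thread, and `count_vertex_shades` is a hypothetical helper.

```python
from collections import deque


def count_vertex_shades(indices, cache_size=16):
    """Count vertex shader invocations for an index buffer, assuming a
    simple FIFO post-transform cache holding `cache_size` vertices."""
    cache = deque(maxlen=cache_size)  # oldest entry is evicted automatically
    shades = 0
    for i in indices:
        if i not in cache:
            shades += 1       # cache miss: the vertex has to be shaded
            cache.append(i)
    return shades


# The same four triangles in two different orders (8 unique vertices).
good_locality = [0, 1, 2,  2, 1, 3,  10, 11, 12,  12, 11, 13]
poor_locality = [0, 1, 2,  10, 11, 12,  2, 1, 3,  12, 11, 13]

# A deliberately tiny cache makes the difference visible on a small mesh.
print(count_vertex_shades(good_locality, cache_size=4))
print(count_vertex_shades(poor_locality, cache_size=4))
```

With the triangles that share vertices placed next to each other, each vertex is shaded once in this model; interleaving unrelated triangles evicts a shared vertex before it is reused, so it has to be shaded again. Reordering triangles to avoid this is the kind of work tools such as meshoptimizer do.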

    What about these old GPUs you mentioned? Will there be multiple calculations? In other words, older Mali GPUs (Utgard, Midgard) simply render all vertices once in incrementing index order, between min and max referenced index?

    Older GPUs are guaranteed not to reshade, but they may shade non-referenced vertices between min and max (various optimizations exist to minimize this, but YMMV). It's therefore recommended to tightly pack used indices, with no unreferenced vertex between min and max.

    HTH, 
    Pete