
How efficient is fragment discarding for large amounts of triangle overlap?

I have a question regarding the efficiency of discarding fragments in scenes that are well sorted but have many overlapping triangles.

If you have a large number of large triangles (e.g. 1000) that are perfectly front-to-back sorted but overlap (or occlude) one another, roughly how large a performance penalty will this incur? I understand that the sorting will result in zero overdraw and thus fewer fragments being processed by the fragment shader, but in this case will Mali be able to discard the fragments that are not visible at a very low performance cost?

If there is a significant performance cost, what is a rule-of-thumb maximum amount of triangle overlap before performance starts to degrade?

Sean

  • Not so much "in the background" as just deeply pipelined.

    We load triangles at one end of the pipeline, rasterize them to fragments, issue fragments to get colored, run the shader program, and finally colored tiles drop out the other end of the pipeline. All of these stages run in parallel, so if one bit stalls for a while doing "redundant" work that's generally not too much of an issue unless:

    1. it stalls for so long that the dominant unit goes idle, or
    2. the unit doing the redundant work is the dominant unit ...

    In well-written content the "run the shader program" part of the pipeline is the dominant part, so the other bits processing some redundant work is not the end of the world (in reality it will have some small knock-on effect due to shared resources such as caches or memory bandwidth, but it is normally minor).
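    To make the overdraw arithmetic concrete, here's a toy software model of early depth testing over one tile; this is not Mali's hardware, just a sketch of why front-to-back sorting means only the nearest layer's fragments reach the shader while every occluded fragment is rejected by the cheap depth comparison before any shading happens (the tile size and layer count below are made up for illustration):

    ```c
    #include <stdio.h>

    #define W 16
    #define H 16
    #define LAYERS 1000   /* overlapping full-tile "triangles", front-to-back */

    int main(void) {
        float depth[W * H];
        long shaded = 0, rejected = 0;

        for (int i = 0; i < W * H; i++)
            depth[i] = 1.0f;                 /* clear to far plane */

        /* Layer t sits at depth t / LAYERS, so later layers are farther. */
        for (int t = 0; t < LAYERS; t++) {
            float z = (float)t / LAYERS;
            for (int i = 0; i < W * H; i++) {
                if (z < depth[i]) {          /* early depth test passes */
                    depth[i] = z;
                    shaded++;                /* fragment shader would run */
                } else {
                    rejected++;              /* killed before shading */
                }
            }
        }
        /* Only the first (nearest) layer shades; the other 999 are rejected. */
        printf("shaded=%ld rejected=%ld\n", shaded, rejected);
        return 0;
    }
    ```

    The rejected fragments still cost rasterization and a depth compare, which is the "redundant work" above, but they never occupy the dominant shader-core stage.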

    > is it valid to assume that Mali's tile based rendering should do well at minimizing external memory writes?


    Yes, all of the color / depth / stencil framebuffer state will remain inside the tile until writeout at the end of the tile, so any blending, depth testing, or MSAA is "free". With suitable use of glInvalidateFramebuffer (or the EXT_discard_framebuffer extension in OpenGL ES 2.0) the transient state (which depth and stencil often are) need never hit main memory at all.
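    A minimal sketch of that invalidation pattern, assuming an OpenGL ES 3.0 context and a hypothetical `fbo` handle whose depth/stencil contents are not needed after the frame (this only shows call ordering; it is not runnable without a GL context):

    ```c
    #include <GLES3/gl3.h>

    void end_frame(GLuint fbo)  /* fbo: hypothetical framebuffer handle */
    {
        glBindFramebuffer(GL_FRAMEBUFFER, fbo);

        /* ... all draw calls for the frame go here ... */

        /* Tell the driver depth/stencil are transient, so the tile's
         * depth/stencil need never be written back to main memory. */
        const GLenum transient[] = { GL_DEPTH_ATTACHMENT, GL_STENCIL_ATTACHMENT };
        glInvalidateFramebuffer(GL_FRAMEBUFFER, 2, transient);

        /* On OpenGL ES 2.0 the equivalent is
         * glDiscardFramebufferEXT(GL_FRAMEBUFFER, 2, transient)
         * from the EXT_discard_framebuffer extension. */
    }
    ```

    The key point is to invalidate after the last draw that reads or writes those attachments but before the framebuffer is resolved or swapped, so the writeout is skipped rather than undone.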


    HTH,

    Pete
