
What prevents non-fragment and fragment workloads from running in parallel?

We see that the non-fragment and fragment workloads are not running in parallel.

"This expression defines the fragment queue utilization compared against the GPU active cycles. For GPU bound content it is expected that the GPU queues will process work in parallel, so the dominant queue should be close to 100% utilized. If no queue is dominant, but the GPU is close to 100% utilized, then there could be a serialization or dependency problem preventing better overlap across the queues."  -- This is from the documentation on the official website, but I cannot work out what it means in practice.

Can you give some specific examples of what prevents vertex shading (VS) and fragment shading (PS) from running in parallel?

And what conditions must be satisfied for them to run in parallel?
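
To make the quoted expression concrete, here is a minimal Python sketch of the calculation it describes. The argument names (gpu_active, non_fragment_active, fragment_active) and the 0.9 threshold for calling a queue "dominant" are assumptions for illustration, not names or values taken from the Arm documentation, and the capture is assumed to be GPU bound.

```python
def queue_utilization(gpu_active, non_fragment_active, fragment_active):
    """Return each queue's active cycles as a fraction of GPU active cycles."""
    if gpu_active <= 0:
        raise ValueError("gpu_active must be positive")
    return {
        "non_fragment": non_fragment_active / gpu_active,
        "fragment": fragment_active / gpu_active,
    }


def diagnose(gpu_active, non_fragment_active, fragment_active,
             dominant_threshold=0.9):
    """Classify a GPU-bound capture (the threshold is an assumed value)."""
    util = queue_utilization(gpu_active, non_fragment_active, fragment_active)
    # Pick the busier of the two queues.
    name, value = max(util.items(), key=lambda item: item[1])
    if value >= dominant_threshold:
        return f"{name} queue is dominant"
    return "no dominant queue: possible serialization or dependency problem"
```

With these made-up numbers, diagnose(1000, 400, 950) reports the fragment queue as dominant, while diagnose(1000, 520, 500) reports no dominant queue, which is the pattern the quoted text warns about.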

  • Hi Shawn, 

    First of all, if you are hitting your target frame rate, e.g. 30 or 60 FPS, you may not be able to get perfect overlap. In that situation it really depends on the GPU load per frame and the GPU frequencies available on the target device.

    If you are not hitting 60 FPS, you're generally looking to avoid dependencies between render passes.

    In OpenGL ES the main causes of this are:

    • Fences or queries which are waited on too early, so the CPU must block and wait for the GPU to resolve the fence/query.
    • Using glMapBufferRange without GL_MAP_UNSYNCHRONIZED_BIT while the buffer is still referenced, so the CPU may have to block and wait for the GPU dependency to be resolved (the driver might block, or it might instead just ghost the resource).
    • Having a vertex shader consume the output of the previous render pass as an input texture or image. 
    • Having a compute shader consume the output of the previous render pass as an input texture or image.
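
    The effect of the last two items can be sketched with a toy two-queue model. This is a simplified illustration in Python, not a description of how any driver actually schedules work: the function name frame_time, the per-pass costs, and the assumption that each pass has exactly one block of vertex work and one block of fragment work are all hypothetical.

```python
def frame_time(passes, vertex_reads_previous_output):
    """Finish time for a list of (vertex_cost, fragment_cost) render passes.

    Vertex work runs on the non-fragment queue and fragment work on the
    fragment queue. Each queue processes its work in submission order.
    """
    non_fragment_free = 0   # time at which the non-fragment queue is idle
    fragment_free = 0       # time at which the fragment queue is idle
    previous_fragment_done = 0

    for vertex_cost, fragment_cost in passes:
        vertex_start = non_fragment_free
        if vertex_reads_previous_output:
            # Vertex shading samples the previous pass's output, so it
            # cannot start until that pass's fragment work has finished.
            vertex_start = max(vertex_start, previous_fragment_done)
        vertex_done = vertex_start + vertex_cost

        # Fragment shading always needs this pass's vertex results.
        fragment_start = max(fragment_free, vertex_done)
        fragment_done = fragment_start + fragment_cost

        non_fragment_free = vertex_done
        fragment_free = fragment_done
        previous_fragment_done = fragment_done

    return fragment_free
```

    For three passes costing 2 units of vertex work and 8 units of fragment work each, frame_time([(2, 8)] * 3, False) gives 26 because later vertex work overlaps earlier fragment work, while frame_time([(2, 8)] * 3, True) gives 30, the fully serialized sum.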

    In Vulkan:

    • Same problems with fences and queries.
    • Having conservative resource dependencies. For example, if the srcStage for generating a resource is "BOTTOM_OF_PIPE" and the dstStage for consuming it is "TOP_OF_PIPE", then the workloads will serialize: "TOP" (vertex shading) must wait for "BOTTOM" (fragment shading) to complete before it can start.
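
    As a rough way to reason about which stage pairs serialize, the sketch below models the Vulkan rule that the source scope of an execution dependency covers the named stage plus all logically earlier stages, and the destination scope covers the named stage plus all logically later ones. It is a simplified Python model, not Vulkan API code: the stage list is a shortened subset of the graphics pipeline, the names drop the VK_PIPELINE_STAGE_ prefix and _BIT suffix, and vertex_waits_for_fragment is a hypothetical helper.

```python
# Simplified, logically ordered subset of the graphics pipeline stages.
STAGES = [
    "TOP_OF_PIPE",
    "VERTEX_SHADER",
    "FRAGMENT_SHADER",
    "COLOR_ATTACHMENT_OUTPUT",
    "BOTTOM_OF_PIPE",
]


def vertex_waits_for_fragment(src_stage, dst_stage):
    """True if the dependency makes later vertex shading wait for
    earlier fragment shading, which stops the two queues overlapping."""
    # Source scope: the named stage and every logically earlier stage.
    src_scope = STAGES[: STAGES.index(src_stage) + 1]
    # Destination scope: the named stage and every logically later stage.
    dst_scope = STAGES[STAGES.index(dst_stage):]
    return "FRAGMENT_SHADER" in src_scope and "VERTEX_SHADER" in dst_scope
```

    In this model vertex_waits_for_fragment("BOTTOM_OF_PIPE", "TOP_OF_PIPE") is True, matching the serialization described above, while vertex_waits_for_fragment("COLOR_ATTACHMENT_OUTPUT", "FRAGMENT_SHADER") is False, so the next pass's vertex work would be free to overlap the previous pass's fragment work.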

    This might help provide some more background: developer.arm.com/.../workload-pipelining

    Kind regards, 
    Pete