Program pipeline performance questions

While reading the slides from ARM's Getting the most out of OpenGL ES 3.0, I noticed that one of the first features presented is Separate Shader Objects (SSOs), which rely on the Program Pipeline feature introduced in OpenGL ES 3.1. However, when I grepped the Mali Android SDK code for "pipeline", the only results I got were a few references in HTML documentation files and comments.

The question is:

  • Do program pipelines carry a serious performance cost compared to plain program objects bound with glUseProgram?

Apple's Best Practices for Shaders documentation seems to encourage the use of pipelines, but it targets other GPUs, so I'm wondering how well SSOs and pipelines perform compared to monolithic programs on Mali GPUs specifically.

  • Hi,

    Creating a program the classic way allows the driver to perform more optimizations across the whole program. The driver first optimizes each shader separately and then, when linking, optimizes the program as a whole. In this phase the compiler performs link-time optimizations. For example, it will remove a varying calculation from the vertex shader if the fragment shader does not read that varying.

    With SSOs, each shader is optimized individually and stored in a form that is much faster to link at runtime, but that does not allow whole-program optimizations (which take time).

    So, in short, some optimizations may not be applied when you use SSOs. The impact depends entirely on your shader code, so it cannot be quantified easily. For example, if your vertex shader outputs 6 varyings but the fragment shader you pair it with reads only 1 of them, the vertex shader will still calculate the other 5 varyings for each vertex. If those calculations are complex, this can hurt performance when the application is vertex bound.
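
    The 6-versus-1 varying example can be sketched as a toy model in C. This is only an illustration under simplifying assumptions, not how the Mali compiler is actually implemented: each varying is represented as one bit in a mask, and the function names (count_bits, live_varyings_monolithic, live_varyings_separable) are hypothetical helpers invented for this sketch.

```c
/* Toy model (hypothetical, not driver code): bit i of a mask stands for varying i. */

/* Count how many varyings are present in a mask. */
static int count_bits(unsigned mask)
{
    int n = 0;
    while (mask) {
        n += (int)(mask & 1u);
        mask >>= 1;
    }
    return n;
}

/* Monolithic program: the linker sees both stages, so a varying stays
 * live only if the vertex shader writes it AND the fragment shader reads it. */
static unsigned live_varyings_monolithic(unsigned vs_outputs, unsigned fs_inputs)
{
    return vs_outputs & fs_inputs;
}

/* Separable program: the vertex shader is compiled without knowing which
 * fragment shader it will be paired with, so in this model every declared
 * output must still be computed. */
static unsigned live_varyings_separable(unsigned vs_outputs, unsigned fs_inputs)
{
    (void)fs_inputs; /* unknown at compile time in this model */
    return vs_outputs;
}
```

    With vs_outputs = 0x3F (6 varyings written) and fs_inputs = 0x01 (1 varying read), the monolithic model keeps 1 varying live while the separable model keeps all 6, which matches the 5 wasted per-vertex calculations described above.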

    Regards,

    Daniele
