Unused varyings optimisation

I've read this: https://community.arm.com/developer/tools-software/graphics/f/discussions/7975/programs-pipelines-performances-questions/30022#30022

Daniele di Donato said: "For example, it will remove a varying calculation in the vertex shader if the fragment shader doesn't declare to use it."

I'd like to know whether the device driver will also remove the computation of varyings that do not ultimately contribute to the fragment shader's output, even if they are declared in the varying struct?

It's typical to write an uber-shader with several features and toggle each of them on and off with preprocessor symbols, but this can become very messy and difficult to maintain. If we could safely rely on the device driver to optimise them out, shader development would be easier. For example, can we safely remove the #ifdef FEATURE_1/#endif pairs from our code and rely exclusively on the device driver to optimise the code out?

In a real situation, there will be a lot of #ifdef FEATURE_X and they can even be nested.

struct Varying {
#ifdef FEATURE_1
    float2 uv2: TEXCOORD1;
#endif
};

half4 frag(Varying input) {
    ...
#ifdef FEATURE_1
    color.rg += input.uv2;  // read the varying through the input struct
#endif
    ...
    return color;
}
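
To make the question concrete, here is a minimal sketch of the case I'm asking about: the varying is always declared and always written by the vertex shader, and only the read in the fragment shader is guarded. All names here (Attributes, vert, the `* 2.0` stand-in computation) are hypothetical and only for illustration. Whether the vertex-side work is actually eliminated when FEATURE_1 is undefined depends on the compiler and driver; this snippet does not demonstrate that it is.

```hlsl
// Hypothetical sketch: names and the uv2 computation are illustrative only.
struct Attributes {
    float4 position : POSITION;
    float2 uv2      : TEXCOORD1;
};

struct Varying {
    float4 position : SV_POSITION;
    float2 uv2      : TEXCOORD1;   // always declared, no #ifdef around it
};

Varying vert(Attributes input) {
    Varying output;
    output.position = input.position;   // transform omitted for brevity
    output.uv2 = input.uv2 * 2.0;       // wasted work if frag never reads uv2
    return output;
}

half4 frag(Varying input) : SV_Target {
    half4 color = half4(0.0, 0.0, 0.0, 1.0);
#ifdef FEATURE_1
    color.rg += half2(input.uv2);       // the only read of uv2
#endif
    return color;
}
```

With FEATURE_1 undefined, nothing in frag depends on uv2, so the hope is that both the varying and the multiply in vert get dropped without any #ifdef in the struct or the vertex shader.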


    "you ALSO need to specialize your buffers to remove unused attributes to ensure you get the bandwidth savings on input vertex data"

    Simply using an interleaved VBO should be enough to prevent unnecessary fetches of the unused attributes, right?

    "like any optimization you are somewhat at the mercy of the compiler so the best option is to specialize the shader if you can as that guarantees the behavior you want."

    I assume that I should not rely on the compiler or the device driver to "pack" the varyings, right?

    Does the compiler make any attempt at all to pack the varyings, or does it leave packing entirely to the user?

    This packing can become very complicated for an uber-shader with a lot of conditional compilation directives (#ifdef).

    struct Varying1 {
        float2 uv1: TEXCOORD0;
        float2 uv2: TEXCOORD1;
    };
    
    struct Varying2 {
        float4 uv1_uv2: TEXCOORD0;
    };
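
For reference, a minimal sketch of what doing the packing by hand looks like for the Varying2 layout above, assuming uv1 goes into .xy and uv2 into .zw. The Attributes struct, the vert/frag names and the final colour output are hypothetical and only there to make the snippet complete; it says nothing about what the compiler would do on its own.

```hlsl
// Hypothetical sketch of manual varying packing; names are illustrative only.
struct Attributes {
    float4 position : POSITION;
    float2 uv1      : TEXCOORD0;
    float2 uv2      : TEXCOORD1;
};

struct Varying2 {
    float4 position : SV_POSITION;
    float4 uv1_uv2  : TEXCOORD0;    // xy = uv1, zw = uv2
};

Varying2 vert(Attributes input) {
    Varying2 output;
    output.position = input.position;               // transform omitted for brevity
    output.uv1_uv2 = float4(input.uv1, input.uv2);  // pack both UV sets into one slot
    return output;
}

half4 frag(Varying2 input) : SV_Target {
    float2 uv1 = input.uv1_uv2.xy;   // unpack first UV set
    float2 uv2 = input.uv1_uv2.zw;   // unpack second UV set
    return half4(uv1, uv2);          // placeholder output so both values are used
}
```

The difficulty with an uber-shader is that the .xy/.zw assignment has to change whenever a feature that owns one half of the float4 is compiled out, which is where the nested #ifdef blocks come from.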
