gl_FragCoord shader variable

rchakena 4 months ago

Hi Forum,

When I profile a simple shader that uses the gl_FragCoord built-in (shown below) with Streamline, I see about 4k FMA instructions and 12k CVT instructions, even though the shader performs no math operations.

I would appreciate any input on why these FMA instructions, and 3x as many CVT instructions, are being generated, and any suggestions for reducing them.

#version 450
precision highp float;
precision highp int;

layout(location = 0) out vec2 out_color;
layout(location = 0) in vec4 texcoord;

void main() {
    out_color = vec2(gl_FragCoord.x, texcoord.x);
}

Peter Harris 4 months ago

Looks like some shader compiler inefficiency around moves - not really much you can do to avoid it at the source level as far as I can tell, but this looks like a synthetic test shader so does it really matter?

rchakena 4 months ago in reply to Peter Harris

Hi Peter,

Thanks for the input. I tried a more realistic test, shown below, which accesses gl_FragCoord.xy as is typically seen in real shaders from a few games.

I still see quite a few CVT (8k) and FMA (8k) instructions for a fragment warp count of 16k. That is approximately 2 FMA (8*4/16) and 2 CVT (8*4/16) instructions per shader invocation. Is this expected?

#version 450
precision highp float;
precision highp int;

layout(location = 0) out vec4 out_color;
layout(location = 0) in vec4 texcoord;

void main() {
    out_color = vec4(gl_FragCoord.xy, texcoord.xy);
}

Peter Harris 4 months ago in reply to rchakena

Yes, there is a small cost to converting the integer fragment coordinate to the floating-point gl_FragCoord.

For example, the common use case is to use gl_FragCoord to drive texturing rather than using a varying. This (even ignoring scaling the UV to 0-1)...

outColor = texture(texSampler, gl_FragCoord.xy);

... is slower than ...

ivec2 texCoord = ivec2(gl_FragCoord.xy);
outColor = texelFetch(texSampler, texCoord, 0);

... which is slower than using a varying and just doing ...

layout(location = 0) in vec2 texCoord;
outColor = texture(texSampler, texCoord);
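
For reference, here is a minimal sketch of that last variant written out as a complete fragment shader. The descriptor set and binding indices, and the assumption that a vertex shader writes a matching vec2 to location 0, are illustrative choices rather than details from the posts above.

```glsl
#version 450
precision highp float;

// Assumed binding; use whatever set/binding your pipeline layout defines.
layout(set = 0, binding = 0) uniform sampler2D texSampler;

// Interpolated UV, assumed to be written by the vertex shader at location 0.
layout(location = 0) in vec2 texCoord;

layout(location = 0) out vec4 outColor;

void main() {
    // No gl_FragCoord access, so no integer-to-float coordinate
    // conversion is needed in the fragment shader.
    outColor = texture(texSampler, texCoord);
}
```

Whether this is a measurable win depends on the shader and the GPU, so it is worth confirming with the same Streamline counters used above.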