Hello Forum,
I have a simple fragment shader running on an Immortalis-G715. The shader does nothing except an equal number of integer (CVT) and floating-point (FMA) operations, and the two sets are independent of each other. The shader core cycle count is almost equal to CVT + FMA instructions, but according to the documentation it should be max(CVT, FMA), since the two can run in parallel in this case. So my question is: does the shader core cycles counter simply report CVT + FMA instructions when the real cost is max(CVT, FMA)?
Regards,
Venkatesh.
Thanks for the response, Peter. Here is a simple shader with independent int and fp operations. I have also provided the Streamline data collected on G715.
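in vec4 v_nFade;
flat in ivec4 v_nFade1;  // integer inputs must be declared flat
out vec4 color;          // output declaration assumed; not shown in the original post

void main() {
    vec4 b = vec4(1.3, 2.1, 3.4, 1.03);
    ivec4 c = ivec4(3, 5, 7, 9);
    for (int i = 0; i < 64; i++) {
        c = v_nFade1 >> c;  // independent integer op
        b = (b * v_nFade);  // independent fp op
    }
    color = vec4(c) + b;
}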
The CVT and FMA instructions-per-invocation data from Streamline matches the shader program.
The core cycles that Streamline reports are close to CVT + FMA instructions, but as I understand it they should be max(CVT, FMA), since the operations are independent and can execute in parallel. Could you clarify? Thanks.
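To make the comparison concrete, here is a minimal sketch in Python of the two simple cost models being compared. The function name and the numbers in the usage note are hypothetical, not Streamline output, and it assumes the instruction counts and the cycle count have already been normalised to the same unit.

```python
def closer_model(cvt, fma, measured_cycles):
    """Say which simple model the measured cycle count sits closer to.

    Assumption: cvt, fma and measured_cycles are expressed in the same
    unit (for example, per invocation). Real hardware is more complex
    than either model.
    """
    parallel = max(cvt, fma)   # pipes fully overlap
    serial = cvt + fma         # pipes do not overlap at all
    d_parallel = abs(measured_cycles - parallel)
    d_serial = abs(measured_cycles - serial)
    if d_parallel == d_serial:
        return "inconclusive"
    return "parallel" if d_parallel < d_serial else "serial"


# Hypothetical numbers for illustration only.
print(closer_model(cvt=64, fma=64, measured_cycles=125))
```

With the made-up values above, 125 is much nearer to 64 + 64 than to max(64, 64), which is the pattern I am seeing in the real counters.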
The pipes are not totally independent in G715. I don't believe the exact behaviour is publicly documented.
I'd be curious what this looks like if you swap the integer shift for an add (although the compiler might just pre-multiply because it's integer code; I need to check the disassembly to be sure).