I have a Unity shader that uses a multi_compile keyword. I am trying to replace the keyword with uniform flow control (a branch on a uniform value) in order to reduce the number of shader variants.
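To make the change concrete, here is a minimal sketch of the kind of replacement meant, assuming Unity's built-in render pipeline. The shader name, the keyword `_MY_FEATURE`, the uniform `_UseFeature`, and the helper `ApplyFeature` are hypothetical placeholders, not the actual shader that was measured.

```hlsl
Shader "Hypothetical/UniformBranchSketch"
{
    Properties
    {
        _MainTex ("Texture", 2D) = "white" {}
        // Hypothetical toggle that stands in for the removed keyword.
        _UseFeature ("Use Feature (0 or 1)", Float) = 0
    }
    SubShader
    {
        Pass
        {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            // Keyword version (one compiled variant per keyword state):
            // #pragma multi_compile _ _MY_FEATURE
            #include "UnityCG.cginc"

            sampler2D _MainTex;
            float4 _MainTex_ST;
            float _UseFeature; // same value for every fragment in a draw call

            struct v2f
            {
                float4 pos : SV_POSITION;
                float2 uv  : TEXCOORD0;
            };

            v2f vert(appdata_base v)
            {
                v2f o;
                o.pos = UnityObjectToClipPos(v.vertex);
                o.uv = TRANSFORM_TEX(v.texcoord, _MainTex);
                return o;
            }

            // Placeholder for the work that the keyword used to enable.
            half3 ApplyFeature(half3 c)
            {
                return c * half3(1.0, 0.5, 0.5);
            }

            half4 frag(v2f i) : SV_Target
            {
                half4 col = tex2D(_MainTex, i.uv);
                // Keyword version would wrap the next statement in:
                //   #if defined(_MY_FEATURE) ... #endif
                // Uniform version: a single variant with a run-time branch.
                if (_UseFeature > 0.5)
                {
                    col.rgb = ApplyFeature(col.rgb);
                }
                return col;
            }
            ENDCG
        }
    }
}
```

In this sketch the toggle would be set from script with `Material.SetFloat("_UseFeature", 1.0f)` in place of `Material.EnableKeyword`.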
I have 4 questions.
Q1: I cannot understand the output of the Mali Offline Compiler (malioc), targeting Mali-G71.
Arithmetic cycles of the fragment shader (in all cases Total Cycles == Shortest Path Cycles == Longest Path Cycles):
- Without the keyword: 7.50
- With the keyword: 7.65
- Uniform flow-control: 7.50
It seems to me that malioc assumes a value for the uniform when it analyses the shader with uniform flow control, and therefore reports the cycles of only one path.
If the instructions of both paths were executed, the cycle count should be noticeably higher, and the shortest and longest path cycles should differ.
Q2: Is uniform flow control really as bad as described here? https://developer.arm.com/documentation/101897/0200/shader-code/uniform-control-flow
Q3: Can we assume that the driver optimises the shader on the fly based on the uniform value, so that only one branch is executed? (I guess not.)
Q4: Which GPU counters should I check in Streamline to spot potential problems with uniform flow control? In my experiments, "Diverged instructions" is close to zero in all cases.