We are running the WebGL conformance tests on a Mali-T760 (Samsung Galaxy S6) in Chrome on Android.
However, the shader test that applies 10000 operators to a uniform hangs, and the GPU times out (after 10 seconds).
The same test runs fine on Adreno (Adreno 530).
test - https://www.khronos.org/registry/webgl/sdk/tests/conformance/glsl/bugs/temp-expressions-should-not-crash.html
Hey Pete,
Thanks for your reply.
Chrome is written so that the GPU times out when GPU commands do not finish within 10 seconds.
This shader test does hit that timeout. If we avoid the timeout (through a setting), the test eventually hangs and crashes the browser with an out-of-memory error.
Strangely, the test succeeds with the Adreno drivers.
Do we have a workaround for this?
Thanks,
Sohan
Is there a specific conformance test failing?
Cheers, Pete
Yes, this GLSL test hangs: https://www.khronos.org/registry/webgl/sdk/tests/conformance/glsl/bugs/temp-expressions-should-not-crash.html
There are other failures, but they don't cause a hang or crash.
Is there any workaround that you can suggest to avoid this hang?
Not one which will make it function correctly; sorry. The downside of very heavy workloads on a mobile platform is that you are either going to time out or run out of memory.
In general we'd recommend that test cases look like the workload you expect a realistic application to run. This test case doesn't sound like it does that.
Hmm. I see. Thanks.
I wonder how Adreno handles it?
If I'm reading that test correctly, the shader compiler could possibly eliminate most of that code. E.g., -uniform/uniform equals -1,-1,-1,-1 (unless you have zeros, but if I remember right, division by zero is undefined in GLSL ES). If you multiply that by the uniform, you get -uniform; add that to +uniform and you get 0,0,0,0, so you can eliminate that whole piece of code.
It may be that the Adreno compiler is optimizing this out, so you end up executing almost a NOP shader, which obviously runs in far less time (no timeout).
But... in my not so humble opinion, if you write this sort of code in a real-world app (where these sorts of optimizations are possible)... you should try a new hobby ;-)
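The folding argument above can be checked numerically. This is only a minimal sketch in Python, not the test's GLSL: it assumes each repeated term has the shape u + (-u / u) * u, as described above (the actual test source may differ), and the names `fold_term` and `chained` are made up for illustration.

```python
# Assumption: the repeated term looks like u + (-u / u) * u, per the
# description above. This is not taken from the conformance test source.

def fold_term(u):
    """Evaluate u + (-u / u) * u component-wise for a 4-component vector.

    Components must be non-zero. Python raises ZeroDivisionError for a
    zero component, whereas GLSL ES leaves the result undefined.
    """
    return tuple(c + (-c / c) * c for c in u)


def chained(u, repeats):
    """Sum the same term `repeats` times, mimicking a long expression chain."""
    acc = (0.0, 0.0, 0.0, 0.0)
    for _ in range(repeats):
        term = fold_term(u)
        acc = tuple(a + t for a, t in zip(acc, term))
    return acc


u = (1.5, -2.0, 3.25, 0.1)
print(fold_term(u))       # every component is zero
print(chained(u, 10000))  # still zero after 10000 repetitions
```

Since every term is zero for any non-zero uniform, a compiler that recognizes the pattern can drop the whole chain without running it.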
Also, you can bring almost any GPU to a 'hang' with a single draw call that renders 65536-2 fullscreen triangles using the most expensive fragment shader the GPU can handle, with blending enabled (to avoid any optimization).
In a heavily simplified calculation for 1080p, assuming a fragment shader that costs 100 cycles per pixel, you get about 13600 Giga cycles to render that draw call. Assuming a single-core GPU running at 1 GHz, that would take roughly 3.8 hours to complete.
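The back-of-the-envelope numbers above can be reproduced with a few lines of Python. This is a sketch under the same simplifications (every triangle covers every pixel, nothing is culled or overlapped, one pixel is shaded at a time); the function name `draw_call_cost` is made up for illustration.

```python
# Simplified cost model: every triangle shades every pixel on screen.
WIDTH, HEIGHT = 1920, 1080      # 1080p
TRIANGLES = 65536 - 2           # fullscreen triangles in one draw call
CYCLES_PER_PIXEL = 100          # assumed fragment shader cost
CLOCK_HZ = 1_000_000_000        # single core at 1 GHz


def draw_call_cost(width=WIDTH, height=HEIGHT, triangles=TRIANGLES,
                   cycles_per_pixel=CYCLES_PER_PIXEL, clock_hz=CLOCK_HZ):
    """Return (total GPU cycles, seconds) for the draw call."""
    pixels = width * height
    total_cycles = pixels * triangles * cycles_per_pixel
    seconds = total_cycles / clock_hz
    return total_cycles, seconds


total_cycles, seconds = draw_call_cost()
# prints: 13589 Gcycles, 3.77 hours
print(f"{total_cycles / 1e9:.0f} Gcycles, {seconds / 3600:.2f} hours")
```

That is far beyond a 10-second watchdog, which is the point: a sufficiently heavy single draw call will time out on practically any GPU.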