Not sure if this is the right place to ask, but how do I disable the watchdog timer on my Note3, which has a Mali-T628? All I could find was DVFS stuff inside /sys/devices/platform/mali.0. Is there something in the user-mode driver I could modify and recompile?
I am trying to run some GLES 3.0 benchmarks and if I increase the number of iterations inside the pixel shader beyond a certain limit, I get a black framebuffer. I figured it must be related to the watchdog timer.
The problem I am facing is that if each thread (all threads must do the same amount of work) reads more than 2048 texels, I get black output instead of the expected rate of 4 bilinear-filtered pixels/clock.
What is your shader doing? I suspect the black results are due to a precision problem in your shaders exceeding the maximum representable range of a variable. Are you able to share?
Most graphics shaders are very short - even high-end content like the GFXBench 3.0 Manhattan test typically only uses a handful of texture accesses - so if you have too many unique accesses I wonder if you are hitting some other limit unrelated to the main texturing unit.
But then the driver seems to be optimizing away all these identical draw calls that write to the same framebuffer.
Multiple opaque draw calls to the framebuffer won't work - we can kill the overdrawn pixels in hardware - see "Killing Pixels - A New Optimization for Shading on ARM Mali GPU". Try turning on blending, as this forces us to keep the overdrawn fragments (we need their color to blend against).
Cheers, Pete
Hmm, my shader fetches from a texture inside a loop and writes the result out to the framebuffer once. All the fetched values are accumulated in a vec4 variable. I should try using highp instead of the lowp/mediump qualifiers then.
On Mali mediump is fp16 precision - so the dynamic range is quite small, and if you start using a significant number of bits to represent non-fractional digits you rapidly run out of the fractional part. Try highp - it sounds like it might help.
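To make the fp16 point concrete, here is a minimal sketch in Python that emulates a half-precision accumulator on the CPU. It only illustrates IEEE 754 fp16 rounding; it is not a model of what the Mali shader core actually does, and the helper names `to_fp16` and `accumulate_fp16` are made up for this example.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    try:
        # The "e" format packs a float as a 16-bit half (round-half-to-even).
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses finite values beyond the fp16 range (max 65504);
        # this sketch assumes such values become infinity.
        return float("inf") if x > 0 else float("-inf")


def accumulate_fp16(value: float, count: int) -> float:
    """Add `value` to a running sum `count` times, rounding to fp16 after each add."""
    total = 0.0
    step = to_fp16(value)
    for _ in range(count):
        total = to_fp16(total + step)
    return total


if __name__ == "__main__":
    # Hypothetical texel value of 1.0 fetched on every loop iteration.
    for fetches in (1024, 2048, 4096):
        print(fetches, accumulate_fp16(1.0, fetches))
```

With 1.0 added per iteration, the sum is exact up to 2048 and then stops growing, because the spacing between representable fp16 values at that magnitude is 2. Whether this has anything to do with the 2048-texel threshold in this thread is not established; it only shows why a mediump accumulator is worth ruling out.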
Okay, so I tried both suggestions:
- Used highp for the output color and the intermediate variable; the black output is still present once the texel fetches go above 2k
- Enabled blending to prevent the pixel-killing optimization
And still no luck :[
Anyway, it's a good thing you guys report the texel fill rate, which is 1 bilinear/clock/unit and 1/2 trilinear/clock/unit. Also, FP16 is full-rate, which I measured.
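For anyone repeating this kind of measurement, a rough sketch of the arithmetic that turns a timed run into texels per clock is below. The function name, its parameters, and all the numbers in the example are hypothetical; they are not the figures from this benchmark.

```python
def texels_per_clock(width, height, fetches_per_pixel, frames,
                     elapsed_s, gpu_hz, units=1):
    """Return measured texel fetches per GPU clock, per texture unit.

    All inputs are supplied by the caller; `units` is the number of
    texture units (shader cores) assumed to be sharing the work.
    """
    total_texels = width * height * fetches_per_pixel * frames
    total_clocks = elapsed_s * gpu_hz
    return total_texels / total_clocks / units


if __name__ == "__main__":
    # Made-up run: 1000x1000 target, 1000 fetches per pixel, one frame,
    # 0.5 s elapsed at an assumed 500 MHz GPU clock.
    whole_gpu = texels_per_clock(1000, 1000, 1000, 1, 0.5, 500e6)
    per_unit = texels_per_clock(1000, 1000, 1000, 1, 0.5, 500e6, units=4)
    print(whole_gpu, per_unit)
```

The GPU clock has to be the frequency the GPU actually ran at during the test, so DVFS needs to be pinned or logged for the result to mean anything.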