Mali G78 VK pass performance

Hello - we are bringing up our unit tests on Mali and have hit what appears to be odd performance under Vulkan on Android (Pixel 6).

A test for timing queries creates a small render target (32x32) and then does the following:

-sets a timer query A
-performs 128 passes clearing that target (Load action Clear, Store action Store)
-sets a timer query B
-performs 128 passes clearing that target (Load action Clear, Store action Store)
-sets a timer query C
-resolves the 3 queries

(The logic is to give the driver busywork between the timing queries.)


Then it validates:

B > A  (time didn't go backwards)
C > B (time didn't go backwards)
and
B < (A + 16 ms)
C < (B + 16 ms)
^^ i.e. "Rendering didn't take a silly amount of time".
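
For reference, the checks above amount to something like the sketch below. This is not our actual test code: the names validateTimestamps, TimingCheck and ticksToMs are made up for illustration, and it assumes the raw query values are in ticks and are converted using the device's VkPhysicalDeviceLimits::timestampPeriod (nanoseconds per tick).

```cpp
#include <cstdint>

// Illustrative sketch only; all names here are hypothetical.
struct TimingCheck {
    bool monotonic;     // B > A and C > B
    bool withinBudget;  // each interval is shorter than the budget
};

// Convert a tick count to milliseconds.
// nsPerTick is assumed to come from VkPhysicalDeviceLimits::timestampPeriod.
inline double ticksToMs(std::uint64_t ticks, double nsPerTick) {
    return static_cast<double>(ticks) * nsPerTick / 1.0e6;
}

inline TimingCheck validateTimestamps(std::uint64_t a, std::uint64_t b,
                                      std::uint64_t c, double nsPerTick,
                                      double budgetMs = 16.0) {
    TimingCheck result{};
    result.monotonic = (b > a) && (c > b);
    // Only compute the deltas once ordering is known, so the unsigned
    // subtractions cannot wrap around.
    result.withinBudget = result.monotonic &&
                          ticksToMs(b - a, nsPerTick) < budgetMs &&
                          ticksToMs(c - b, nsPerTick) < budgetMs;
    return result;
}
```

With a hypothetical 1000 ns tick, validateTimestamps(0, 40000, 80000, 1000.0) reports monotonic but not withinBudget, which is the failure we see on the G78.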

The numbers we're getting back on the G78 are on the order of 40 ms or more for those 128-pass clear loops, even though the render target is only 32x32.
For a loop of only 10 passes, we get timings anywhere from 0.3 ms to 2 ms.
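
To put those numbers on a per-pass basis, here is a rough back-of-the-envelope calculation. It assumes the 40 ms figure applies to a single 128-pass loop and that cost is spread evenly across passes; perPassMs and the constant names are just illustrative.

```cpp
// Average cost per pass in milliseconds, assuming an even split across passes.
constexpr double perPassMs(double totalMs, int passes) {
    return totalMs / passes;
}

// What the 16 ms limit allows per pass over 128 passes: 0.125 ms.
constexpr double kBudgetPerPassMs = perPassMs(16.0, 128);

// Observed on the G78 with 128 passes (40 ms total): 0.3125 ms per pass.
constexpr double kObserved128PerPassMs = perPassMs(40.0, 128);

// Observed with 10 passes: roughly 0.03 ms to 0.2 ms per pass.
constexpr double kObserved10LowPerPassMs  = perPassMs(0.3, 10);
constexpr double kObserved10HighPerPassMs = perPassMs(2.0, 10);
```

Under those assumptions the 128-pass loop averages about 0.31 ms per pass, roughly 2.5x what the 16 ms limit allows and higher than even the slowest 10-pass average.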

Are passes so expensive on Mali that this test shouldn't be considered valid? These timings are enormous compared to what we see on other architectures.

These are the device details:

[INFO ]: Loading libvulkan.so...
[INFO ]: Vulkan API Version : 1.1.189
[INFO ]: Vulkan Device 0 : Mali-G78
[INFO ]: Driver Version : 32.1.0
[INFO ]: Driver's API Version : 1.1.177