As shown in the figure above, the same compute shader runs correctly on a Redmi RMX2072 (Adreno) but produces wrong results on a Huawei LIO-AN00 (Arm Mali-G76). The purpose of this shader is to calculate a globally unique, sequentially increasing index, namely v2.x, from gl_GlobalInvocationID.xy. In our tests so far, we have not found any Mali device that runs it correctly.
CompressedASTC_CS_Error and CompressedASTC_CS_Right are the results of running the ComputeShader, which can also be seen in RenderDoc.
Pay attention to the lines after 65537: they are all zeros in CompressedASTC_CS_Error, which was produced on the Mali device.
All the files needed are in Mali_CS (including the RenderDoc capture and all the files mentioned above).
Is this a bug in the Mali compute shader implementation?
Please help!
I suspect the problem is that on pre-Valhall GPUs (including Mali-G76), and on Valhall devices with older drivers, GL_MAX_TEXTURE_BUFFER_SIZE is only 65,536 elements. Because the limit is 64K, our shader compiler uses only the low 16 bits of the index passed in, so when u1 goes beyond this range it effectively wraps around. This explains why the write with u1 = 65,536 ended up modifying the first entry (index 0). For use cases like this, where you want to modify a large buffer from a single dispatch, we'd recommend using SSBOs instead, as they support much larger buffer bindings.
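To make the wrap-around concrete, here is a rough Python model of the behaviour described above. It is only a sketch, not the driver's or compiler's actual logic: the helper names (effective_index, simulate_writes) are made up for illustration, and the only assumption taken from the explanation is that the low 16 bits of the index are kept.

```python
# Hypothetical model: only the low 16 bits of the element index are used
# when addressing the buffer (assumption based on the explanation above).
INDEX_MASK = 0xFFFF  # corresponds to a 65,536-element limit


def effective_index(index: int) -> int:
    """Return the index that would actually be addressed after truncation."""
    return index & INDEX_MASK


def simulate_writes(total: int) -> list:
    """Write i into slot i for every i in range(total), with truncated addressing."""
    buf = [0] * total
    for i in range(total):
        buf[effective_index(i)] = i
    return buf


if __name__ == "__main__":
    buf = simulate_writes(65540)
    print(effective_index(65536))  # index 65,536 lands on the first entry
    print(buf[0], buf[65536])      # first entry overwritten, later entry untouched
```

In this model every write past 65,535 lands back at the start of the buffer, so the entries from 65,536 onwards are never written and stay at zero, which is consistent with the zeros seen in CompressedASTC_CS_Error.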
Hope that helps. :)