Why is the uniform buffer limit so small in the "Arm GPU Best Practices Developer Guide"?

https://developer.arm.com/documentation/101897/0301/Shader-code/Uniforms

The guide says:

"Keep your uniform data small. 128 bytes is a good general rule for how much data can be promoted to registers in any given shader."

Seems to be too small; the OpenGL ES UBO max size is 16 KB.

128 bytes is only 32 floats, which is easy to exceed.

Is this because the SGPR register file on mobile GPUs is too small?

  • Mali can promote uniforms to constant registers, which are effectively free to access (no per-thread load in the shader code). Arm GPUs in the Bifrost family or newer actually have 512 bytes of uniform storage (used by both user uniforms and driver-managed uniforms), so the actual limit on modern hardware is a lot higher than the 128 bytes in the document.

    If you use more uniforms than this it works fine, but falls back to memory reads from the uniform buffer in shader code, which will be slightly slower and put more pressure on work register storage.

    > Seems to be too small; the OpenGL ES UBO max size is 16 KB.

    I doubt any single shader is actually loading 16KB of uniform data though - that's going to go slow on any GPU architecture ...

    Cheers,
    Pete
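
The byte budgets mentioned above can be turned into counts of 32-bit values with some simple arithmetic. The following is only a rough sketch: the 128-byte and 512-byte figures come from the thread, while the helper names are hypothetical and the 512-byte figure is shared with driver-managed uniforms, so the space actually left for user uniforms is smaller by an amount the thread does not state.

```python
# Rough uniform budget arithmetic (illustrative only).
# The two budgets below are the figures quoted in the thread; the helper
# functions are hypothetical and not part of any Arm tool or API.

BYTES_PER_32BIT_VALUE = 4       # one float, int, or uint

GUIDE_BUDGET_BYTES = 128        # rule of thumb from the best practices guide
BIFROST_BUDGET_BYTES = 512      # total storage on Bifrost or newer, shared
                                # with driver-managed uniforms


def values_in_budget(budget_bytes):
    """Number of 32-bit values that fit in a byte budget."""
    return budget_bytes // BYTES_PER_32BIT_VALUE


def fits(num_32bit_values, budget_bytes):
    """True if that many 32-bit values fit entirely within the budget."""
    return num_32bit_values * BYTES_PER_32BIT_VALUE <= budget_bytes


MAT4 = 16  # a 4x4 float matrix is 16 32-bit values

print(values_in_budget(GUIDE_BUDGET_BYTES))    # 32
print(values_in_budget(BIFROST_BUDGET_BYTES))  # 128
print(fits(2 * MAT4, GUIDE_BUDGET_BYTES))      # True: two mat4 use exactly 128 bytes
print(fits(3 * MAT4, GUIDE_BUDGET_BYTES))      # False: three mat4 need 192 bytes
print(fits(3 * MAT4, BIFROST_BUDGET_BYTES))    # True, ignoring driver-managed uniforms
```

Exceeding either budget is not an error. As described above, the extra uniforms are simply read from memory instead of being promoted.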

  • Thanks for the reply.

    I used Mali Offline Compiler to analyse the shader, and it reports Uniform registers: 128 (200% used).

    Does that mean 512 bytes?

    Do I need to optimize it down to 32?

    Desktop GPUs seem to have a 16k SGPR register file. I don't really want mobile GPUs to have 16 KB of uniforms, but 128 bytes seems too small.

    If 512 bytes were available for user uniforms alone, that wouldn't be bad.

  • > I used Mali Offline Compiler to analyse the shader, and it reports Uniform registers: 128 (200% used).

    The "200%" is a bug in the tool - it's off by a factor of 2 and should report 100% used (fixed for the next release due in the Autumn). But yes, 128 registers = 512 bytes. 

    There isn't a hard drop in performance if you exceed the uniform register storage, so if you are happy with the performance of your shader then you are probably OK. 

    Note that a uniform register can pack two 16-bit values, so using mediump/RelaxedPrecision annotations can help reduce uniform register usage.

  • Thanks.
    I feel relaxed now.
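
The register arithmetic from the replies above (128 registers = 512 bytes, and two 16-bit values per register) can be sketched as a quick estimate. This is an idealized calculation with hypothetical helper names: it assumes every pair of 16-bit values packs perfectly and ignores driver-managed uniforms, so the figure reported by the actual compiler may differ.

```python
# Idealized estimate of uniform register usage (illustrative only).
# Assumptions, not confirmed behaviour of any compiler:
#   - each 32-bit value takes one register
#   - 16-bit (mediump / RelaxedPrecision) values pack perfectly, two per register
#   - driver-managed uniforms are ignored

TOTAL_UNIFORM_REGISTERS = 128   # from the thread: 128 registers = 512 bytes


def estimate_registers(num_32bit_values, num_16bit_values):
    """Registers needed, rounding an odd 16-bit value up to a full register."""
    return num_32bit_values + (num_16bit_values + 1) // 2


def usage_percent(registers_used, registers_available=TOTAL_UNIFORM_REGISTERS):
    """Share of the uniform register storage in use, as a percentage."""
    return 100.0 * registers_used / registers_available


all_highp = estimate_registers(128, 0)    # 128 registers
all_mediump = estimate_registers(0, 128)  # 64 registers
mixed = estimate_registers(64, 65)        # 64 + 33 = 97 registers

print(all_highp, usage_percent(all_highp))      # 128 100.0
print(all_mediump, usage_percent(all_mediump))  # 64 50.0
print(mixed)                                    # 97
```

Under these assumptions, marking the same 128 values as mediump would halve the estimated usage from 100% to 50%.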