I have to decide which approach we use for instancing:

- Vertex attribute instancing
- UBO
- SSBO

According to the docs, on the one hand they say:

Do

- Use a single interleaved vertex buffer for all instance data.
- Use instanced attributes to work around limitation of 16KB uniform buffers.
- If feasible, try to use a power-of-two divisor, i.e. number of vertices per instance should be power-of-two.
- Prefer indexed lookups using gl_InstanceID into uniform buffers or shader storage buffers if the suggestions here cannot be followed.
- Prefer instanced attributes if instance data can be represented with smaller data types, as uniform buffers and shader storage buffers cannot utilize these denser data types.

So it seems to say that vertex attributes ARE preferred IF we meet the requirements.

However, in another place on the website it says this:

- Prefer indexed lookups using gl_InstanceID into uniform buffers or shader storage buffers, rather than per-instance attribute data.

That is stated unconditionally, as if UBOs and SSBOs are always preferred to vertex attributes.

What is the preferred approach if we do meet the requirements above?

Thanks!
I suspect the best answer is situational, depending on GPU and exactly what the content is doing with it.
For any modern Arm GPU, personally I would always use a UBO/SSBO array lookup. I can't think of any situation where it would be substantially worse than a per-instance attribute lookup, unless you are doing something pathological.
I see... well, the point is which situations are pathological? :)
Is it OK to use a UBO of the max size for instance data?
> well, the point is which situations are pathological? :)
Shaders accessing enough buffers and/or attributes to start thrashing descriptor caches. Shaders using robust buffer access may be slower for indexed array lookups over fixed-function lookups.
My main point was that it is unlikely that this is going to be the major performance cost of your application - if per-instance data ends up being more expensive to access than the much more voluminous per-vertex data you're doing something very strange =)
Total data size shouldn't make a major difference - the underlying data cache will be the same for both routes, so if you have a problem with one, the other would have similar issues. Note that on some older GPUs the maximum UBO size can be quite small (16KB) compared to other buffer types, so that might be a reason to prefer per-instance attribute loads.
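To put the 16KB figure in perspective, here is a rough sketch of the arithmetic, assuming a hypothetical 80-byte per-instance record (a mat4 plus a vec4, matching a std140 array element). The struct and helper names are made up for illustration; in a real application the limit would be queried at runtime (e.g. `GL_MAX_UNIFORM_BLOCK_SIZE` via `glGetIntegerv`) rather than hard-coded.

```cpp
#include <cstddef>

// Hypothetical per-instance record: mat4 (64 bytes) + vec4 (16 bytes).
struct InstanceData {
    float model[16];
    float color[4];
};
static_assert(sizeof(InstanceData) == 80, "expected an 80-byte instance record");

// Assumed limit for this example: the 16KB UBO size mentioned for older GPUs.
constexpr std::size_t kSmallUboBytes = 16 * 1024;

// How many whole instance records fit into one uniform block.
constexpr std::size_t maxInstancesPerUbo(std::size_t uboBytes,
                                         std::size_t bytesPerInstance) {
    return bytesPerInstance == 0 ? 0 : uboBytes / bytesPerInstance;
}

// How many instanced draws are needed if each draw reads one full block
// (rounds up so the final partial batch is counted).
constexpr std::size_t drawCallsNeeded(std::size_t instanceCount,
                                      std::size_t instancesPerDraw) {
    return instancesPerDraw == 0
               ? 0
               : (instanceCount + instancesPerDraw - 1) / instancesPerDraw;
}
```

With these assumptions, a 16KB block holds 204 instances, so drawing 1000 instances would take 5 batches; an SSBO or a per-instance attribute buffer would not be bound by this particular limit.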