Load/Store Unit and 16-bit arithmetic results from malioc are not as expected

Yiyuan Wang 2 months ago

I profiled my shaders and found that the Load/Store (LS) unit value was extremely large, so I simplified the shader and ran some tests.

Environment: Mali-G715, glslc from the latest Vulkan SDK.

#version 450
#define LENGTH  512 // 1024

layout(set = 0, binding = 0, std140) mediump uniform ubo0 {
    mediump vec4 data[LENGTH];
} _ubo0;

layout(set = 0, binding = 1, std140) mediump uniform ubo1 {
    mediump vec4 data[LENGTH];
} _ubo1;

layout(location = 0) out mediump vec4 outColor;

void main()
{
    outColor = vec4(0);
    for(int i = 0; i < LENGTH; i++)
    {
        outColor += _ubo0.data[i];
    }

    //for(int i = 0; i < LENGTH; i++)
    //{
    //    outColor += _ubo1.data[i];
    //}
}

void confusedMain()
{
    outColor = vec4(0);
    for(int i = 0; i < LENGTH; i++)
    {
        outColor += _ubo0.data[i];
        outColor += _ubo1.data[i];
    }
}

The profile result:

Uniform Count	Per-Uniform Length	LS	16-bit arithmetic	Uniform Register	Function
1	512	0.00	N/A	2 (3% used)	main
1	1024	2.00	0.0	2 (3% used)	main
2	512	0.0	N/A	2 (3% used)	main
2	1024	4.00	0.0	2 (3% used)	main
2	512	4.00	0.0	2 (3% used)	confusedMain

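The results seem to indicate that the LS value is related to the size of the UBO. However, when I tried confusedMain, which reads both UBOs in a single loop, the results confused me: with a length of 512 it reports an LS value of 4.00, while main reports 0.0 for the same two UBOs.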
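To put the array lengths in bytes: under std140 each element of a vec4 array occupies 16 bytes, so the amount of uniform data read per shader invocation can be tabulated as below. This is only a minimal Python sketch of that arithmetic; the LS values are copied from the table above, and the names VEC4_STD140_STRIDE and ubo_bytes are made up for illustration.

```python
# Bytes per element of a vec4 array under the std140 layout.
VEC4_STD140_STRIDE = 16

# (UBOs read, array length, LS value reported, function),
# copied from the profile table in this post.
rows = [
    (1, 512, 0.0, "main"),
    (1, 1024, 2.0, "main"),
    (2, 512, 0.0, "main"),
    (2, 1024, 4.0, "main"),
    (2, 512, 4.0, "confusedMain"),
]


def ubo_bytes(length: int) -> int:
    """Size in bytes of a std140 UBO holding `length` vec4 elements."""
    return length * VEC4_STD140_STRIDE


for count, length, ls, func in rows:
    per_ubo = ubo_bytes(length)
    total = count * per_ubo
    print(f"{func:>12}: {count} x {per_ubo} B = {total} B read, LS = {ls}")
```

Both two-UBO cases with a length of 512 read the same 16384 bytes but report different LS values, so the total amount of uniform data alone does not seem to explain the numbers.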

So I have some questions about the result above.

Q1: Does the size of the UBO really affect LS? Could it be that there is a special cache inside the chip, but due to the limited cache size, a large UBO increases LS?

Q2: Why do different UBO sizes give different 16-bit arithmetic results?

Q3: Why did different calculation orders produce different results in the example above?

Q4: Why does the UBO size affect 16-bit arithmetic?

Top replies

Peter Harris 2 months ago +1 verified

Yiyuan Wang said: Q1: Does the size of the UBO really affect LS? Could it be that there is a special cache inside the chip, but due to the limited cache size, a large UBO increases LS? The result of...

Yiyuan Wang 2 months ago in reply to Peter Harris

Thanks for your answer. I also have a question about UV precision. I found that if a variable is used as a UV to sample a texture, all of its dependent variables are promoted to higher precision, even if I use textureLod so that ddx and ddy are not calculated. Is this a hardware limitation? If I use texelFetch, this does not happen.

Peter Harris 2 months ago in reply to Yiyuan Wang

Correct, all float texture coords get promoted to highp. Decent linear filtering needs 8 bits of subtexel precision after any wrapping has been applied, so fp16 is almost never usable.

Integer coords don't need subtexel precision, as there is no filtering, so texelFetch is fine.
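As a rough illustration of the fp16 point above, the sketch below estimates how many bits of subtexel precision remain when a normalized texture coordinate is stored in half precision. It is not taken from Arm documentation; the texture sizes, the sample coordinate 0.75, and the helper names are assumptions chosen for the example, written in Python with the struct module's half-precision format.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def fp16_ulp(x: float) -> float:
    """Spacing between adjacent fp16 values around x (normal range only)."""
    exponent = math.floor(math.log2(abs(x)))
    return 2.0 ** (exponent - 10)  # fp16 stores 10 explicit mantissa bits


def subtexel_bits(texture_size: int, u: float = 0.75) -> float:
    """Bits of subtexel resolution left for normalized coordinate u.

    Assumes u is a normalized coordinate and texture_size is the
    texture width in texels (both hypothetical example inputs).
    """
    step_in_texels = fp16_ulp(u) * texture_size
    return -math.log2(step_in_texels)


# Two coordinates 1/256 of a texel apart in a 1024-wide texture
# collapse to the same fp16 value.
u = 0.75
u_next = u + 1.0 / (1024 * 256)
print(to_fp16(u) == to_fp16(u_next))

for size in (8, 64, 256, 1024, 4096):
    print(size, subtexel_bits(size))
```

Under these assumptions only a texture 8 texels wide keeps 8 subtexel bits in the upper half of the [0, 1) range; at 1024 texels a single bit remains, and at 4096 texels fp16 can no longer address every texel.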

Yiyuan Wang 2 months ago in reply to Peter Harris

Thank you. I understand why textureLod requires high-precision UV coordinates. However, it seems somewhat unreasonable to me that all dependent variables automatically have their precision increased. I would expect this precision increase to have a limited range of influence, but I could not find one. Or is there any way to stop this precision increase from propagating?

Peter Harris 2 months ago in reply to Yiyuan Wang

Yiyuan Wang said:
Or is there any way to stop this precision increase from propagating?

Not that I know of.

Peter Harris 2 months ago in reply to Peter Harris

P.S. Happy to review your shader and provide some advice if you are able to share. You can contact the team at developer@arm.com if you can't share publicly.