What are MaliALUInstructionsFMAPipeInstructions in terms of f32 scalar mul/add operations?

Hi,

I'm trying to understand the performance of one of my test shaders. I'm using a Mali G78 on a Pixel 6a (I believe it has 20 cores). In Streamline I'm seeing about 15 giga-instructions per second, with arithmetic unit utilization of around 99%.

According to my calculation, the shader is processing around 600 giga scalar add/mul operations per second. I get this by counting the operations in the shader (which I don't think can be optimized away), multiplying by the number of pixels, by 4 because I'm using vec4, and by the fps.
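The estimate described above can be written out as a short Python sketch. The function name and all input values in the usage lines are hypothetical placeholders chosen only to show the arithmetic; they are not the actual shader or resolution figures.

```python
def scalar_ops_per_second(ops_per_pixel, width, height, components, fps):
    """Estimate scalar add/mul operations per second.

    ops_per_pixel: add/mul operations counted in the shader, per vector op
    components:    4 for vec4, since each vector op touches 4 scalars
    """
    pixels = width * height
    return ops_per_pixel * pixels * components * fps


# Hypothetical inputs (assumptions, not measured values).
example = scalar_ops_per_second(
    ops_per_pixel=100, width=1000, height=1000, components=4, fps=60
)
print(f"{example / 1e9:.1f} giga scalar ops/s")
```

Plugging in the real operation count, render resolution, and frame rate gives the figure to compare against the counter.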

I am not sure how to reconcile the 15 giga-instructions/s with my calculated 600 GFLOPS. If I assume one instruction can operate on 32 f32 values simultaneously, that gives 15 * 32 = 480 GFLOPS, which is still considerably lower than what I estimate from my shader and fill rate.
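To make the gap concrete, here is a minimal Python sketch of the same comparison. It uses only the two figures quoted in this post and the assumed width of 32; the variable names are illustrative, and the 32-wide figure is the assumption made above, not a documented property of the counter.

```python
# Figures quoted in the post, in units of 1e9 per second.
measured_ginstr_per_s = 15.0    # Streamline counter value
estimated_gflops = 600.0        # estimate from shader op count and fill rate
assumed_ops_per_instr = 32      # assumption: one instruction covers 32 f32 values

# Throughput implied by the assumed width.
implied_gflops = measured_ginstr_per_s * assumed_ops_per_instr

# Scalar operations each counted instruction would need to represent
# for the counter and the estimate to agree.
required_ops_per_instr = estimated_gflops / measured_ginstr_per_s

print(f"implied by width 32: {implied_gflops:.0f} GFLOPS")
print(f"ops per instruction needed: {required_ops_per_instr:.0f}")
print(f"shortfall: {estimated_gflops - implied_gflops:.0f} GFLOPS")
```

In other words, each counted instruction would have to stand for about 40 scalar add/mul operations for the two numbers to match, versus the 32 assumed here.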

Thanks,

Lorenzo
