I'd like to know what it means when a fragment shader is bound by the Varying unit (V), as in our case.
According to: https://developer.arm.com/documentation/101863/7-4/Mali-GPU-pipelines/Mali-Bifrost-architecture
The varying pipeline is a dedicated pipeline which implements the varying interpolator.
Does this mean the shader spends more cycles interpolating varyings than on ALU operations, and that reducing the number of varyings could potentially reduce the fragment shader's cycle count?
For example:
Mali Offline Compiler v7.4.0 (Build 330167)
Copyright 2007-2021 Arm Limited, all rights reserved
Configuration
=============
Hardware: Mali-G71 r0p1
Architecture: Bifrost
Driver: r32p0-00rel0
Shader type: OpenGL ES Fragment
Main shader
===========
Work registers: 24
Uniform registers: 12
Stack spilling: false
16-bit arithmetic: 60%
                                A      LS       V       T    Bound
Total instruction cycles:    1.42    0.00    3.50    2.00        V
Shortest path cycles:        1.42    0.00    3.50    2.00        V
Longest path cycles:         1.42    0.00    3.50    2.00        V
A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
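
To make my reading of the table concrete, here is a minimal Python sketch. It assumes the Bound column simply names the unit with the highest cycle count; the helper names `bound_unit` and `headroom` are my own for illustration and are not part of any Arm tool. The numbers are copied from the "Total instruction cycles" row above.

```python
# Cycle counts copied from the "Total instruction cycles" row of the report.
cycles = {"A": 1.42, "LS": 0.00, "V": 3.50, "T": 2.00}


def bound_unit(cycles):
    """Return the unit with the highest cycle count (hypothetical helper).

    Assumption: this mirrors what the report's "Bound" column shows.
    """
    return max(cycles, key=cycles.get)


def headroom(cycles):
    """Return how many cycles the bound unit sits above the next-busiest unit."""
    ordered = sorted(cycles.values(), reverse=True)
    return ordered[0] - ordered[1]


bound = bound_unit(cycles)
print(bound)                                   # V
print(round(cycles[bound] / cycles["A"], 2))   # 2.46
print(headroom(cycles))                        # 1.5
```

If that assumption holds, trimming varyings should only pay off for the first 1.5 cycles or so: once V drops below 2.00, the texture unit (T) would become the bound instead.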
This makes sense and surprises me at the same time. Thanks a lot, because I would never have figured it out on my own.
- Is there any counter in Streamline that would help us detect this kind of promotion?
- Are there any other precision surprises we should expect from the compiler?
- Is there any way to defeat this optimization (say, when the texture is very small)?