Hello Forum,
Example:
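#version 300 es
// Version line is an assumption: in/out qualifiers need GLSL ES 3.00 or later.
precision mediump float;
in vec4 in1, in2;
out vec4 col;
void main() { col = in1 * in2; }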
The shader performs four FP16 multiplies (a single vec4 multiply). On G720, only the FMA unit is used. That makes sense: FP16 operations on Mali are typically executed as vec2 SIMD, so the vec4 multiply becomes two FMA instructions per thread.
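To make that counting concrete, here is the same multiply split by hand into two vec2 halves. This is only a sketch of how I assume the vec4 maps onto two vec2 FP16 operations, not what the compiler actually emits. The names lo and hi are hypothetical, and the version line is assumed.

```glsl
#version 300 es
// Version line is an assumption: in/out qualifiers need GLSL ES 3.00 or later.
precision mediump float;

in vec4 in1, in2;
out vec4 col;

void main() {
    // First vec2 FP16 multiply: components x and y.
    vec2 lo = in1.xy * in2.xy;
    // Second vec2 FP16 multiply: components z and w.
    vec2 hi = in1.zw * in2.zw;
    // Recombine the halves; the result equals in1 * in2.
    col = vec4(lo, hi);
}
```

Four FP16 lanes at two lanes per instruction gives the two FMA instructions per thread seen on G720.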
However, on G725 the CVT unit is also used, and its instruction count matches the FMA count. Why does G725 need CVT instructions in this case?
Thanks,
Venkatesh.