Query on CVT perf counter on G720 and G725

Hello Forum,

I’ve noticed that the CVT pipeline counter reports activity for FP16 arithmetic on Mali G725, but stays at zero on Mali G720 for the same shader.

Example:

#version 300 es
precision mediump float;

in vec4 in1, in2;

out vec4 col;

void main() { col = in1 * in2; }

        cvt    fma    sfu   narrow   warps
G720      0   8320      0     8320   16324
G725   8247   8249      0    16473   16448

The shader performs four FP16 multiplies per thread (one vec4 multiply). On G720, only the FMA counter increments, which makes sense: FP16 operations on Mali are typically executed as vec2 SIMD, so the vec4 multiply should result in two instructions per thread.

However, on G725, the CVT counter also increments, and its count is almost identical to the FMA count (8247 vs 8249). Why does G725 need CVT instructions for this shader?
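
To make the two devices easier to compare, the raw counts from the table above can be normalized by the warp count. This is only a minimal Python sketch: the dictionary layout and the helper name `per_warp` are illustrative and not part of any Mali tool, and since the post does not state what one counter increment represents, the ratios are only a relative comparison.

```python
# Counter values copied from the table in this post.
# The structure and names below are illustrative assumptions.
counters = {
    "G720": {"cvt": 0,    "fma": 8320, "sfu": 0, "narrow": 8320,  "warps": 16324},
    "G725": {"cvt": 8247, "fma": 8249, "sfu": 0, "narrow": 16473, "warps": 16448},
}


def per_warp(device):
    """Return each pipeline counter divided by the warp count for one device."""
    row = counters[device]
    warps = row["warps"]
    return {name: value / warps for name, value in row.items() if name != "warps"}


for device in counters:
    ratios = per_warp(device)
    line = "  ".join(f"{name}={ratio:.3f}" for name, ratio in ratios.items())
    print(f"{device}: {line}")
```

With the posted numbers, FMA comes out at roughly 0.51 per warp on G720, while on G725 CVT and FMA each come out at roughly 0.50 per warp.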

Thanks,

Venkatesh.