Some questions about MSAA on Mali-G77

Hi, I'm a mobile game developer and I recently tried enabling 4x MSAA in my game. As far as I know, MSAA is almost "free" on Mali GPUs. I'm using UE 4.27 and built a demo to profile the performance.

The demo uses the forward rendering pipeline (Vulkan), and every scene object's material uses the unlit shading model, which simply outputs the world-space normal as the pixel color.

The profiling result shows that with 4x MSAA enabled, GPU Active increases by about 20%! Fragment queue active also increases by about 20%.

I understand that 4x MSAA sends more primitives to the rasterizer and creates more quads, because there are four times as many sample points within each pixel.
What confuses me is that the increase in Fragment warps and Execution Core Active does not match the increase in GPU Active and Fragment queue active: Fragment warps increase by about 3%, while Execution Core Active increases by about 30%.
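To illustrate the point about extra sample points, here is a toy software coverage test in Python. It is only a sketch under stated assumptions, not a model of how the Mali rasterizer works: the triangle and the 8x8 grid are made up for illustration, and the 4x offsets are the Vulkan standard sample locations, which a given driver may or may not use. It counts pixels rather than quads, but the effect is the same: pixels along a triangle edge whose center is not covered can still have a covered sample, so they produce fragments only when MSAA is on.

```python
# 1x: a single sample at the pixel center.
SAMPLES_1X = [(0.5, 0.5)]
# 4x: Vulkan standard sample locations (assumed here, not confirmed for this device).
SAMPLES_4X = [(0.375, 0.125), (0.875, 0.375), (0.125, 0.625), (0.625, 0.875)]

# Hypothetical triangle, in pixel coordinates, used only for illustration.
TRI = ((0.0, 0.0), (8.0, 0.0), (0.0, 4.0))


def edge(a, b, p):
    """Edge function: positive on one side of the line a->b, negative on the other."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def inside(tri, p):
    """True if point p lies inside the triangle, for either winding order."""
    a, b, c = tri
    w0, w1, w2 = edge(a, b, p), edge(b, c, p), edge(c, a, p)
    return (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0)


def covered_pixels(tri, width, height, samples):
    """Count pixels in which at least one sample point is covered by the triangle."""
    count = 0
    for y in range(height):
        for x in range(width):
            if any(inside(tri, (x + sx, y + sy)) for sx, sy in samples):
                count += 1
    return count


print("1x:", covered_pixels(TRI, 8, 8, SAMPLES_1X))  # 16 pixels
print("4x:", covered_pixels(TRI, 8, 8, SAMPLES_4X))  # 20 pixels
```

For this particular triangle the extra pixels all sit along the sloped edge, so the growth depends on how much edge length the scene has relative to its filled area.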

Since all objects in my demo scene use the same simple material, I expected that when the workload (e.g. fragment warps) increases by A%, GPU Active would also increase by around A%, or even less. But according to the profiling result, that does not seem to be the case.
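One way to frame the mismatch is to ask how much each counter grew per fragment warp. The short Python sketch below uses only the approximate percentages quoted above. The helper name `per_warp_growth` is made up for illustration, and treating "cycles per warp" as a meaningful ratio is itself an assumption, since these counters measure different things.

```python
# Approximate relative increases quoted above (4x MSAA vs. no MSAA).
growth = {
    "GPU Active": 0.20,
    "Fragment queue active": 0.20,
    "Execution Core Active": 0.30,
}
warp_growth = 0.03  # Fragment warps


def per_warp_growth(counter_growth, warps_growth):
    """Growth of the ratio counter/warps, given each quantity's own relative growth."""
    return (1.0 + counter_growth) / (1.0 + warps_growth) - 1.0


for name, g in growth.items():
    print(f"{name}: {per_warp_growth(g, warp_growth):+.1%} per warp")
```

If warp count were the only thing driving cost, these per-warp figures would all be close to zero. They come out at roughly +17% for GPU Active and Fragment queue active and +26% for Execution Core Active, which is another way of stating the question: something other than the number of warps got more expensive.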

Even stranger, after enabling 4x MSAA the utilization of the varying unit and the texture unit decreases!? (More warps, but less varying/texturing work????)

So, my questions are:
1. Is MSAA not actually "free"? Is the 20% ~ 30% increase in GPU Active expected?
2. Why do Fragment warps, Execution Core Active, and GPU Active grow at different rates?
3. What is happening in each unit when 4x MSAA is enabled?