1. I found a strange problem. I tested the following two kernels, and the first kernel (pic 1) runs faster than the second kernel (pic 2). The test platform is a Mali-T864 with GlobalWorkSize = 10000000 (10M). The first kernel takes 15 ms and the second takes 20 ms.
pic 1
pic 2
2. I used mali_offline_compiler to profile them, and the results for the two kernels are identical, as shown in pic 3. How are Instructions Emitted and Path Cycles obtained? Why is Instructions Emitted twice the Longest Path Cycles? Also, I would expect 3 L/S operations, so why are 4 reported here?
Thanks for your reply. I've asked the question again here:
community.arm.com/.../mali_offline_compiler-question
So please mark this one as resolved.