Dear All,
One of my use cases for the ARM Mali GPU is running video (HEVC) decode kernels. However, we have found that the overhead of the OpenCL kernel launch calls clEnqueueNDRangeKernel and clEnqueueTask is much higher than the execution time of the kernels themselves. This reduces the overall video decoding speed considerably.
Is there anything we can do to reduce this overhead? Any tips? If you need more details about the issue, I can provide them.
Regards
Paul
Hi Anthony,
Using the prebuilt binary file works. The time dropped from 10+ ms to 100+ us.
Thumbs up
Irving
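
For readers who want to try the prebuilt binary approach, here is a minimal sketch in C. It assumes "prebuilt binary" means caching the compiled program with clGetProgramInfo and reloading it with clCreateProgramWithBinary, and that the context holds exactly one device. The helper names (save_blob, load_blob, save_program_binary, load_program_binary) and the USE_OPENCL macro are hypothetical and only for illustration; they are not from the posts above.

```c
#include <stdio.h>
#include <stdlib.h>

/* Write `size` bytes to `path`. Returns 0 on success, -1 on failure. */
int save_blob(const char *path, const unsigned char *data, size_t size)
{
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t written = fwrite(data, 1, size, f);
    int close_failed = fclose(f);
    return (written == size && !close_failed) ? 0 : -1;
}

/* Read a whole file into a malloc'd buffer (caller frees).
   Returns NULL if the file is missing, empty, or unreadable. */
unsigned char *load_blob(const char *path, size_t *size_out)
{
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long len = ftell(f);
    if (len <= 0) { fclose(f); return NULL; }
    rewind(f);
    unsigned char *buf = malloc((size_t)len);
    if (buf && fread(buf, 1, (size_t)len, f) != (size_t)len) {
        free(buf);
        buf = NULL;
    }
    fclose(f);
    if (buf) *size_out = (size_t)len;
    return buf;
}

#ifdef USE_OPENCL /* hypothetical macro: define it when building against an OpenCL SDK */
#include <CL/cl.h>

/* First run: after building from source, save the device binary.
   Assumes the program is associated with exactly one device. */
int save_program_binary(cl_program program, const char *path)
{
    size_t size = 0;
    if (clGetProgramInfo(program, CL_PROGRAM_BINARY_SIZES,
                         sizeof size, &size, NULL) != CL_SUCCESS || size == 0)
        return -1;
    unsigned char *bin = malloc(size);
    if (!bin) return -1;
    int rc = -1;
    /* CL_PROGRAM_BINARIES expects an array of pointers, one per device. */
    if (clGetProgramInfo(program, CL_PROGRAM_BINARIES,
                         sizeof bin, &bin, NULL) == CL_SUCCESS)
        rc = save_blob(path, bin, size);
    free(bin);
    return rc;
}

/* Later runs: create the program from the cached binary.
   Returns NULL on any failure so the caller can fall back to source. */
cl_program load_program_binary(cl_context ctx, cl_device_id dev, const char *path)
{
    size_t size = 0;
    unsigned char *bin = load_blob(path, &size);
    if (!bin) return NULL;
    const unsigned char *bins[1] = { bin };
    cl_int status = CL_SUCCESS, err = CL_SUCCESS;
    cl_program p = clCreateProgramWithBinary(ctx, 1, &dev, &size, bins, &status, &err);
    /* clBuildProgram is still required for a program created from a binary. */
    if (err != CL_SUCCESS || status != CL_SUCCESS ||
        clBuildProgram(p, 1, &dev, NULL, NULL, NULL) != CL_SUCCESS) {
        if (p) clReleaseProgram(p);
        p = NULL;
    }
    free(bin);
    return p;
}
#endif /* USE_OPENCL */
```

A cached binary is tied to the device and driver version it was built with, so it seems safest to keep the source path around and rebuild from source whenever load_program_binary returns NULL.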