Hi everyone,
Recently I have been working on a GPU application. It will run on an Arndale board and use the Mali GPU. To make program execution faster I wanted to optimize memory usage. According to the OpenCL guide, CL_MEM_ALLOC_HOST_PTR should be used to improve performance, and the use of CL_MEM_USE_HOST_PTR is discouraged.
But in my experiments, I found that using CL_MEM_USE_HOST_PTR actually reduces data transfer time but increases kernel execution overhead. I also found that a data copy is inevitable in both cases (CL_MEM_ALLOC_HOST_PTR and CL_MEM_USE_HOST_PTR).
Can anyone confirm this? Is it possible at all to have zero-copy?
The Mali OpenCL guide says that using CL_MEM_ALLOC_HOST_PTR requires no copy. But there is a copy. Let’s say I have a pointer A with my data, and I create a buffer using CL_MEM_ALLOC_HOST_PTR. To make the data of A available to the GPU, I have to do a memcpy from A into the allocated space I get from the CL_MEM_ALLOC_HOST_PTR buffer.
So a data copy is still needed. Is there a way to access the data directly from the GPU without any copying?
PS: I have attached my code for your feedback.
UPDATE: I have uploaded a version using CL_MEM_ALLOC_HOST_PTR for your review.
This is the code snippet:
#endif
Hi Michael,
Thanks for the detailed reply.