Hi everyone,
Recently I have been working on a GPU application. It will run on an Arndale board and use the Mali GPU. To make the program run faster, I wanted to optimize its memory usage. According to the OpenCL guide, CL_MEM_ALLOC_HOST_PTR should be used to improve performance, and the use of CL_MEM_USE_HOST_PTR is discouraged.
But in my experiments, I found that using CL_MEM_USE_HOST_PTR actually reduces data transfer time but increases kernel execution overhead. I also found that a data copy is inevitable in both cases (CL_MEM_ALLOC_HOST_PTR and CL_MEM_USE_HOST_PTR).
Can anyone confirm? Is it possible at all to have a zero copy?
The Mali OpenCL guide says that using CL_MEM_ALLOC_HOST_PTR requires no copy. But there is a copy. Let’s say I have a pointer A and I create a buffer using CL_MEM_ALLOC_HOST_PTR. To make the data of A available to the GPU, I have to do a memcpy from A to the allocated space I get from CL_MEM_ALLOC_HOST_PTR.
So a data copy is needed. Is there a way to access the data directly from the GPU without any copying?
PS: I have attached my code for your feedback.
UPDATE: I have uploaded a version that uses CL_MEM_ALLOC_HOST_PTR for your review.
This is the code snippet:
#endif
The second reason it's slow is that some initialisation operations are deferred until the first time an object is actually used.
Also, in a real-life application you would allocate your buffers once and then map/unmap them every frame, so for a realistic test case you should do something like:
createBuffer();
for (int i = 0; i < 100; i++) {
    timer_start();
    map();
    fill_buffer();
    unmap();
    enqueue_kernel();
    finish();
    timer_end();
}
releaseBuffer();
When you do that, you should see that the first iteration takes longer, for the reasons explained above, and that all the following iterations are much faster.
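For what it's worth, here is a minimal sketch in plain C of how the per-iteration timing could be recorded so that the first (warm-up) iteration is kept apart from the steady state. The names frame_fn, time_frames and count_frame are made up for illustration, and the callback only stands in for the map / fill / unmap / enqueue / finish sequence; no OpenCL call is made here, so treat it as a starting point rather than a finished benchmark.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime on strict compilers */
#endif

#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: one full frame of work
   (map, fill buffer, unmap, enqueue kernel, finish). */
typedef void (*frame_fn)(void *user);

/* Monotonic wall-clock time in milliseconds. */
static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1e3 + (double)ts.tv_nsec / 1e6;
}

/* Runs `frame` `iterations` times and stores each duration in times_ms[i].
   Returns the mean duration of iterations 1..n-1 (the steady state),
   skipping iteration 0, or -1.0 if there are fewer than two iterations. */
double time_frames(frame_fn frame, void *user, size_t iterations,
                   double *times_ms)
{
    assert(frame != NULL && times_ms != NULL);
    double steady_sum = 0.0;
    for (size_t i = 0; i < iterations; i++) {
        double start = now_ms();
        frame(user);                  /* the work being measured */
        times_ms[i] = now_ms() - start;
        if (i > 0)                    /* iteration 0 is the warm-up */
            steady_sum += times_ms[i];
    }
    return iterations > 1 ? steady_sum / (double)(iterations - 1) : -1.0;
}

/* Stand-in frame for illustration only: counts how often it is called. */
void count_frame(void *user)
{
    int *calls = user;
    (*calls)++;
}
```

In a real test the callback would wrap the actual OpenCL calls (clEnqueueMapBuffer, filling the mapped pointer, clEnqueueUnmapMemObject, clEnqueueNDRangeKernel and clFinish), with the buffer created once before the loop and released after it. You would then compare times_ms[0] against the returned steady-state mean.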