Hi everyone,
I'd like to know what happens at the hardware level when I call clEnqueueMapBuffer.
Is the whole buffer invalidated in the CPU-side cache?
And when I call clEnqueueUnmapMemObject,
is the whole buffer invalidated in the GPU-side cache?
Thanks!
CL_MEM_ALLOC_HOST_PTR is just a hint to the driver: if you don't set it, the driver might assume you won't access the memory from the host side and might therefore use memory that is uncached on the CPU.
In practice the driver might simply ignore this flag and cache the memory anyway, which is why you might not always see a performance difference.
Map / Unmap are pretty much just calls into the DMA cache maintenance routines of the Linux kernel (I'm sure you can find some information about them in the Linux kernel documentation if you're interested).