
OpenCL: mapping a buffer to the CPU takes a long time

I am using OpenCL 1.1 on a Mali-T628 (Exynos5422).

1.

First I create a buffer:

buffer = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_ALLOC_HOST_PTR, 1280*720*4, NULL, &errorNumber);

Next I run the kernel on the buffer and wait for the command queue to finish.

Finally, I map the buffer to the CPU:

*pCPU = (unsigned char *)clEnqueueMapBuffer(command_queue, buffer,
            CL_TRUE,
            CL_MAP_READ,
            0,
            1280*720*4,
            0, NULL, NULL, &errorNumber);

Mapping the buffer to the CPU takes 1843 us.

2.

Then I changed CL_MAP_READ to CL_MAP_WRITE:

*pCPU = (unsigned char *)clEnqueueMapBuffer(command_queue, buffer,
            CL_TRUE,
            CL_MAP_WRITE,
            0,
            1280*720*4,
            0, NULL, NULL, &errorNumber);

I thought this would save time, but it still takes 1891 us.

As I understand it, CL_MEM_ALLOC_HOST_PTR means that map/unmap only translates the pointer (and perhaps flushes the cache) without copying any memory. Why does it take so long?

thx!
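
For scale, here is a small sketch that turns the two timings in the question into an effective rate over the mapped range. The function names are made up for illustration, and the only inputs are the buffer size and the 1843 us / 1891 us figures quoted above.

```c
#include <stdio.h>

/* Size in bytes of the buffer from the question:
 * 1280 x 720 pixels, 4 bytes per pixel. */
unsigned long buffer_bytes(void)
{
    return 1280UL * 720UL * 4UL;
}

/* Effective rate in MB/s (1 MB = 10^6 bytes) when `bytes` bytes are
 * covered in `micros` microseconds. Bytes per microsecond is
 * numerically the same as MB per second. */
double effective_mb_per_s(unsigned long bytes, double micros)
{
    return (double)bytes / micros;
}

/* Print the rate for both map calls measured in the question. */
void print_map_rates(void)
{
    unsigned long n = buffer_bytes();

    printf("buffer size   : %lu bytes\n", n);
    printf("CL_MAP_READ   : %.0f MB/s\n", effective_mb_per_s(n, 1843.0));
    printf("CL_MAP_WRITE  : %.0f MB/s\n", effective_mb_per_s(n, 1891.0));
}
```

Calling print_map_rates() from main shows that both maps cover the roughly 3.5 MiB buffer at about 2 GB/s. A rate like that is at least consistent with the time being spent on work that scales with the size of the whole range, rather than on a pointer translation alone.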

  • Hi,

    You're right: map/unmap doesn't copy any memory. What takes time is the CPU cache maintenance.

    To make the cache maintenance run as quickly as possible, make sure your CPUs are set to the performance governor:

    echo "performance" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

    Hope this helps,

    Anthony
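
The echo command in the reply only writes the governor for cpu0, while the Exynos5422 has eight cores (four Cortex-A15 and four Cortex-A7). Below is a minimal sketch of doing the same thing for every core from C. It assumes the standard Linux cpufreq sysfs layout (cpuN/cpufreq/scaling_governor) and that the program runs with enough privileges to write there; the function names and the NUM_CPUS constant are illustrative, not part of any library.

```c
#include <stdio.h>

/* Assumption: eight cores, as on the Exynos5422 (4x A15 + 4x A7). */
#define NUM_CPUS 8

/* Build the sysfs path of the governor file for one core.
 * Returns 0 on success, -1 if the path does not fit in buf. */
int governor_path(char *buf, size_t len, int cpu)
{
    int n = snprintf(buf, len,
                     "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_governor",
                     cpu);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Write `governor` (e.g. "performance") to one core's governor file.
 * Returns 0 on success, -1 on any failure (missing file, no permission). */
int set_governor(int cpu, const char *governor)
{
    char path[128];
    FILE *f;
    int ok;

    if (governor_path(path, sizeof path, cpu) != 0)
        return -1;
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    ok = fputs(governor, f) >= 0;
    if (fclose(f) != 0)   /* the write may only fail when flushed */
        ok = 0;
    return ok ? 0 : -1;
}

/* Try to set the governor on every core; returns the number of failures. */
int set_governor_all(const char *governor)
{
    int cpu, failures = 0;

    for (cpu = 0; cpu < NUM_CPUS; ++cpu)
        if (set_governor(cpu, governor) != 0)
            ++failures;
    return failures;
}
```

Calling set_governor_all("performance") before timing the map should then cover both clusters. Cores that are offline or that share a cpufreq policy may report a failure or simply repeat a write, so a non-zero return value is worth checking but is not necessarily an error.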
