I am using OpenCL 1.1 on a Mali-T628 (Exynos 5422).
1.
First, I create a buffer:
buffer = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_ALLOC_HOST_PTR, 1280*720*4, NULL, &errorNumber);
Next, I run the kernel on the buffer and wait for the command queue to finish.
Finally, I map the buffer to the CPU:
*pCPU = (unsigned char *)clEnqueueMapBuffer(command_queue,buffer,
CL_TRUE,
CL_MAP_READ,
0,
1280*720*4,
0, NULL, NULL, &errorNumber);
Mapping the buffer to the CPU takes 1843 us.
2.
Then I changed CL_MAP_READ to CL_MAP_WRITE in the map call:
CL_MAP_WRITE,
I thought this would save time, but it still takes 1891 us.
As I understand it, CL_MEM_ALLOC_HOST_PTR means that mapping/unmapping the buffer only translates the pointer (and possibly flushes the cache) and does not copy memory. Why does it take so long?
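To put the two timings in perspective, here is a small sketch of the arithmetic in C. The helper names image_bytes and throughput_mb_per_s are made up for illustration and are not part of OpenCL; the only inputs are the buffer size and the map times quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of a width x height image with bytes_per_pixel bytes per pixel.
   Hypothetical helper, not an OpenCL function. */
static size_t image_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

/* Effective throughput in MB/s, taking 1 MB = 10^6 bytes.
   One byte per microsecond is numerically equal to 1 MB/s,
   so no extra scaling factor is needed. */
static double throughput_mb_per_s(size_t bytes, double microseconds)
{
    assert(microseconds > 0.0);
    return (double)bytes / microseconds;
}
```

Calling image_bytes(1280, 720, 4) gives 3686400 bytes. Passing that to throughput_mb_per_s with 1843 us gives roughly 2000 MB/s, and with 1891 us roughly 1949 MB/s. These figures alone do not show whether the driver copies the data or only does cache maintenance; repeating the measurement with a smaller buffer would show whether the map time scales with the buffer size.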
Thanks!
Thank you for your answer.
I have checked the config files:
$cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
performance
$cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
$cat /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
$cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
$cat /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
$cat /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
$cat /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor
$cat /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
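The same check can be done from the host program instead of typing eight cat commands. This is only a sketch: read_first_line and print_governors are hypothetical helper names, and the code assumes the sysfs paths shown above exist and are readable on the board.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read the first line of the file at path into out (at most out_len - 1
   characters) and strip the trailing newline.
   Returns 0 on success, -1 if the file cannot be opened or is empty. */
static int read_first_line(const char *path, char *out, size_t out_len)
{
    assert(out != NULL && out_len > 0);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(out, (int)out_len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    out[strcspn(out, "\n")] = '\0';
    return 0;
}

/* Print the scaling governor of cpu0 .. cpu(cpu_count - 1).
   Returns the number of CPUs whose governor file could be read. */
static int print_governors(int cpu_count)
{
    char path[128];
    char governor[64];
    int readable = 0;

    for (int cpu = 0; cpu < cpu_count; ++cpu) {
        snprintf(path, sizeof path,
                 "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_governor", cpu);
        if (read_first_line(path, governor, sizeof governor) == 0) {
            printf("cpu%d: %s\n", cpu, governor);
            ++readable;
        } else {
            printf("cpu%d: (unreadable)\n", cpu);
        }
    }
    return readable;
}
```

Calling print_governors(8) before the timed section covers the eight cores of the Exynos 5422 and makes it easy to spot a core that is offline or not set to performance.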