I am using the ODROID-XU3 board, which has the Samsung Exynos 5422 SoC. The SoC has big.LITTLE CPUs and the Arm Mali-T628 MP6 GPU. I would like to run the CPU and the GPU in parallel on different sections of the array that I am processing. Currently, to enforce coherency, I have to use clEnqueueMapBuffer and clEnqueueUnmapMemObject. These OpenCL calls degrade performance so much that running the CPU and GPU in parallel becomes pointless. I have the following questions.
1) Are the caches of the ARM CPU and the GPU on this SoC coherent?
2) The GPU shows up as two devices in OpenCL. Are these GPUs cache coherent?
3) Is there any way to enforce coherency other than using the MapBuffer and UnmapMemObject CL functions?
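To make the setup concrete, here is a minimal sketch of how the array could be divided between the two processors. It is only an illustration: `split_t` and `split_range` are hypothetical names, not OpenCL functions, and the idea of rounding the boundary to a whole number of elements per cache line (for example 16 floats, assuming a 64-byte line, which should be checked for this SoC) is an assumption rather than something the board documentation confirms.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type, not part of OpenCL: two half-open element ranges. */
typedef struct {
    size_t gpu_begin, gpu_count; /* GPU works on [gpu_begin, gpu_begin + gpu_count) */
    size_t cpu_begin, cpu_count; /* CPU works on [cpu_begin, cpu_begin + cpu_count) */
} split_t;

/*
 * Give the GPU roughly gpu_percent of the n elements and the CPU the rest.
 * The boundary is rounded down to a multiple of align_elems so that the two
 * ranges do not meet in the middle of an aligned block (assumed here to
 * correspond to one cache line).
 */
static split_t split_range(size_t n, unsigned gpu_percent, size_t align_elems)
{
    assert(gpu_percent <= 100);
    assert(align_elems > 0);

    /* floor(n * gpu_percent / 100), computed without overflowing n * gpu_percent */
    size_t boundary = (n / 100) * gpu_percent + (n % 100) * gpu_percent / 100;

    /* round the boundary down to the alignment */
    boundary -= boundary % align_elems;

    split_t s = { 0, boundary, boundary, n - boundary };
    return s;
}
```

For example, `split_range(1000, 60, 16)` would hand elements 0..591 to the GPU and 592..999 to the CPU. The two ranges never overlap, so the only remaining issue is making each processor's writes visible to the other, which is what the map/unmap calls are currently used for.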
I will repeat what peterharris has already said:
ARM just license the GPU IP to our silicon partners; the achievable top frequency depends on many aspects of physical implementation. This question is best aimed at the supplier of a specific chip.
We cannot answer that question, as it is not controlled by us. Please ask the silicon provider of your target SoC whether the GPU can run at that clock frequency.
Thank you for your answer.