I am trying to use OpenCL to replace some matrix multiplication and vector computation, but the GPU is always 2 to 4 times slower than the CPU.
On the CPU we use NEON SIMD; on the GPU I also use vector types such as float4 and float16.
I ran these tests on an MT6753 (ARM Cortex-A53 @ 1.5 GHz, Mali-T720).
All data is of type float.
Has anyone done similar work on a Mali GPU? I wonder whether any optimization is possible.
I would like the GPU to be faster than the CPU. If you have run a workload that is faster on the GPU than on the CPU, please help me.
I need some baseline figures comparing CPU and Mali GPU compute performance, e.g. GFLOPS and so on.
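First I map the memory object to the CPU and write the data into it, then unmap it and run the OpenCL kernel. After the kernel finishes, I map the memory object again and read the result.
The memory object is created with CL_MEM_ALLOC_HOST_PTR. I also think I could maybe use the fill, read and write buffer calls (clEnqueueFillBuffer, clEnqueueReadBuffer, clEnqueueWriteBuffer) instead.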
I know there are some optimizations I can try:
1. Choose a better way to create the memory object and to read, write or copy it
2. Rewrite the kernel function
Please share some basic performance data; I really don't know how fast I can get on the GPU. If you have experience with this, please help me.
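To make the numbers comparable, here is a minimal sketch of how a CPU baseline in GFLOPS could be measured. It is plain C without NEON, so it is not the code from the tests above; the helper names (matmul_ref, gflops, now_seconds, measure_cpu_gflops) are made up for illustration, and it assumes a POSIX system with clock_gettime. It counts 2*M*N*K floating-point operations per matrix multiplication.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Naive reference C = A * B, row-major. A is m x k, B is k x n, C is m x n. */
static void matmul_ref(const float *a, const float *b, float *c,
                       int m, int n, int k)
{
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = acc;
        }
    }
}

/* One multiply and one add per inner step gives 2*m*n*k operations. */
static double gflops(int m, int n, int k, double seconds)
{
    assert(seconds > 0.0);
    return 2.0 * m * n * k / seconds / 1e9;
}

/* Monotonic wall-clock time in seconds (assumes POSIX clock_gettime). */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Times `iters` multiplications of two n x n matrices and returns the
 * average GFLOPS, or 0.0 if allocation fails or the timer shows no
 * elapsed time. */
static double measure_cpu_gflops(int n, int iters)
{
    size_t count = (size_t)n * (size_t)n;
    float *a = malloc(count * sizeof *a);
    float *b = malloc(count * sizeof *b);
    float *c = malloc(count * sizeof *c);
    double result = 0.0;

    if (a && b && c) {
        for (size_t i = 0; i < count; ++i) {
            a[i] = (float)(i % 7) * 0.5f;
            b[i] = (float)(i % 5) * 0.25f;
        }
        volatile float sink = 0.0f; /* keeps the work from being optimized away */
        double t0 = now_seconds();
        for (int it = 0; it < iters; ++it) {
            matmul_ref(a, b, c, n, n, n);
            sink += c[0];
        }
        double elapsed = now_seconds() - t0;
        if (elapsed > 0.0)
            result = gflops(n, n, n, elapsed / iters);
    }
    free(a);
    free(b);
    free(c);
    return result;
}
```

Calling measure_cpu_gflops(256, 10) gives one number for the CPU side. For a fair comparison, the GPU timing should cover the same matrix sizes and should include the map/unmap steps as well as the kernel itself, so that it is clear where the time goes.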