
Compute performance of CPU vs. Mali GPU?

I'm trying to use OpenCL to replace some matrix multiplication and vector computation, but the GPU is always 2 to 4 times slower than the CPU.

On the CPU we use NEON SIMD; on the GPU I also use vector types such as float4 and float16.

I ran these tests on an MT6753 (ARM Cortex-A53 @ 1.5 GHz, Mali-T720).

All data is of type float.

Has anyone done similar work on a Mali GPU? I wonder whether any optimization is possible.

I was hoping the GPU would be faster than the CPU. If you have run any workload that is faster on the GPU than on the CPU, please help me.

I need some baseline figures comparing the compute performance of the CPU and the Mali GPU, e.g. GFLOPS and so on.
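To make the CPU and GPU numbers comparable, here is a minimal sketch of how a GFLOPS figure for a matrix multiply could be computed. It assumes the usual count of 2*m*n*k floating-point operations for an (m x k) by (k x n) product. All function names are hypothetical helpers, not part of OpenCL or NEON, and the reference multiply is a naive scalar loop, not the NEON code mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Floating-point operations in C = A * B, with A (m x k) and B (k x n):
 * one multiply and one add per inner-loop step. */
double matmul_flops(size_t m, size_t n, size_t k)
{
    return 2.0 * (double)m * (double)n * (double)k;
}

/* Convert an operation count and an elapsed time in seconds to GFLOPS. */
double gflops(double flops, double seconds)
{
    assert(seconds > 0.0);
    return flops / seconds / 1e9;
}

/* Naive row-major reference multiply (hypothetical helper, no NEON). */
void matmul_naive(const float *a, const float *b, float *c,
                  size_t m, size_t n, size_t k)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (size_t p = 0; p < k; ++p)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = acc;
        }
    }
}

/* Processor time, in seconds, spent on `reps` runs of matmul_naive. */
double time_matmul_naive(const float *a, const float *b, float *c,
                         size_t m, size_t n, size_t k, int reps)
{
    clock_t start = clock();
    for (int r = 0; r < reps; ++r)
        matmul_naive(a, b, c, m, n, k);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

The CPU figure would then be gflops(reps * matmul_flops(m, n, k), time_matmul_naive(a, b, c, m, n, k, reps)). For the GPU side, the same gflops() call can be fed a wall-clock time measured around the kernel, since clock() only counts processor time and would not capture time spent waiting on the GPU.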

First I map the memobj to the CPU and write data into it, then unmap the memobj and run the OpenCL kernel. After the kernel finishes, I map the memobj again and read the result.

The memobj is created with CL_MEM_ALLOC_HOST_PTR. I also think I could maybe use clEnqueueFillBuffer, clEnqueueReadBuffer and clEnqueueWriteBuffer instead.

I know of some optimizations I can try:

1. Choose a better way to create the memobj and to read/write or copy it

2. Rewrite the kernel function
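For item 2, one possible direction is sketched below: a kernel in which each work-item computes four adjacent output columns with float4, next to a plain C mirror that can be used to check the GPU result. This is only a sketch under assumptions: the kernel name matmul_f4, the row-major layout, and the requirement that N is a multiple of 4 are all made up for illustration, and the host-side setup is not shown. Whether it is actually faster on the Mali-T720 would have to be measured.

```c
#include <stddef.h>

/* OpenCL C source for a hypothetical kernel "matmul_f4".
 * Host code (not shown) would pass this string to
 * clCreateProgramWithSource and launch with global size (N / 4, M).
 * Assumes row-major A (M x K), B (K x N), C (M x N), N a multiple of 4. */
const char *const kMatmulF4Source =
    "__kernel void matmul_f4(__global const float *A,\n"
    "                        __global const float *B,\n"
    "                        __global float *C,\n"
    "                        const int N, const int K)\n"
    "{\n"
    "    const int col4 = get_global_id(0); /* group of 4 columns */\n"
    "    const int row  = get_global_id(1);\n"
    "    float4 acc = (float4)(0.0f);\n"
    "    for (int k = 0; k < K; ++k) {\n"
    "        float  a = A[row * K + k];\n"
    "        float4 b = vload4(col4, B + k * N);\n"
    "        acc += a * b;\n"
    "    }\n"
    "    vstore4(acc, col4, C + row * N);\n"
    "}\n";

/* Plain C mirror of the kernel above, with the same loop order,
 * for checking the GPU output on the CPU. */
void matmul_f4_reference(const float *A, const float *B, float *C,
                         int M, int N, int K)
{
    for (int row = 0; row < M; ++row) {
        for (int col4 = 0; col4 < N / 4; ++col4) {
            float acc[4] = {0.0f, 0.0f, 0.0f, 0.0f};
            for (int k = 0; k < K; ++k) {
                float a = A[row * K + k];
                const float *b = B + k * N + 4 * col4;
                for (int j = 0; j < 4; ++j)
                    acc[j] += a * b[j];
            }
            for (int j = 0; j < 4; ++j)
                C[row * N + 4 * col4 + j] = acc[j];
        }
    }
}
```

Comparing the buffer read back from the GPU against matmul_f4_reference on a small input is a cheap way to confirm a rewritten kernel is still correct before timing it.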

Please share some basic performance data; I really don't know how fast I can get on the GPU. If you have experience with this, please help me.