Low GFLOPS on Mali-T628 MP6 using OpenCL

Hi,

I used SHOC to benchmark the Arndale Octa board and the newest Samsung Galaxy Note 10.1, both of which contain a Mali-T628 MP6.

I focused on the GFLOPS reported by the MAdd_{1,2,4,8,16} tests. The results range from 2 to 24 GFLOPS, which seems unreasonably low.

According to the theoretical peak calculation, the T628 MP6 should deliver around 102 GFLOPS.
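To put the measured range in context, here is a minimal Python sketch that compares the figures quoted above against the 102 GFLOPS peak. The per-core value is only what 102 / 6 implies, not a number taken from a datasheet, and all variable names are illustrative.

```python
# Figures quoted in the post. The per-core value is an assumption:
# it is simply the quoted peak divided evenly across the 6 cores.
THEORETICAL_PEAK_GFLOPS = 102.0
TOTAL_CORES = 6
MEASURED_RANGE_GFLOPS = (2.0, 24.0)  # lowest and highest MAdd results

implied_per_core = THEORETICAL_PEAK_GFLOPS / TOTAL_CORES

# Fraction of the theoretical peak reached by the worst and best results.
fraction_of_peak = [m / THEORETICAL_PEAK_GFLOPS for m in MEASURED_RANGE_GFLOPS]

print(f"implied per-core peak: {implied_per_core:.1f} GFLOPS")
for measured, frac in zip(MEASURED_RANGE_GFLOPS, fraction_of_peak):
    print(f"{measured:.0f} GFLOPS is {frac * 100:.1f}% of peak")
```

Even the best result reaches less than a quarter of the quoted peak, which is why the numbers look suspicious.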

The benchmark program itself does not seem to be the problem, since it reported 30 to 40 GFLOPS on the Nexus 10 (Mali-T604 MP4).

That is, the T628 MP6 delivered fewer GFLOPS than the T604 MP4 in SHOC's MaxFlops test.

How can this happen?

Yoshi

  • Hi Yoshi,

    I would certainly follow Pete's recommendations regarding DVFS.  It's worth adding that although the Mali-T628 MP6 does have 6 GPU cores, 2 more than the Mali-T604 MP4 in the Nexus 10, those 6 cores are split 4:2 between two core groups.  OpenCL sees these as two separate devices and won't automatically split a job across both groups; by default it runs the job on the 4-core group only.  So, depending on relative GPU frequency and taking any DVFS issues into account, I would expect the T628 MP6 and the Nexus 10 to perform roughly equivalently.

    Hope that's useful,

    Tim
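As a rough illustration of the 4:2 split described above, the Python sketch below estimates the peak available to each core group and shows one way a job could be divided by hand in proportion to core count. It assumes every core contributes equally to the 102 GFLOPS figure from the question and that both groups run at the same clock; the function names are hypothetical and not part of any OpenCL or SHOC API.

```python
def peak_for_group(total_peak_gflops, total_cores, group_cores):
    """Peak for one core group, assuming all cores contribute equally."""
    return total_peak_gflops * group_cores / total_cores


def split_work(total_items, group_cores):
    """Divide work items between core groups in proportion to core count."""
    total = sum(group_cores)
    shares = [total_items * cores // total for cores in group_cores]
    # Integer division can leave a remainder; give it to the first group.
    shares[0] += total_items - sum(shares)
    return shares


CORE_GROUPS = [4, 2]  # the 4:2 split of the Mali-T628 MP6

group_peaks = [peak_for_group(102.0, 6, cores) for cores in CORE_GROUPS]
work_shares = split_work(1_000_000, CORE_GROUPS)

print("peak per group (GFLOPS):", group_peaks)
print("work items per group:", work_shares)
```

Under these assumptions, a job that runs only on the default 4-core group has about two thirds of the chip's peak available, which is consistent with the expectation that it performs roughly like a 4-core part.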
