
How to optimize sparse matrix-vector multiplication (SpMV) with OpenCL on a Mali GPU?

SpMV multiplies a sparse matrix A by a dense vector B to produce a dense vector C: C = A*B

I use the CSR sparse matrix format, but the result is even slower than multiplying a dense matrix of the same size by a dense vector.
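For reference, the CSR computation in question can be sketched in plain C as below. This is a minimal scalar version for illustration, not a tuned Mali kernel; the function name and the array names (row_ptr, col_idx, values) are assumptions, not taken from any particular library. In an OpenCL port, the outer loop would typically become one work-item per row.

```c
#include <stddef.h>

/* y = A * x, with A stored in CSR form.
 * row_ptr has n_rows + 1 entries; the nonzeros of row i live at
 * positions row_ptr[i] .. row_ptr[i + 1] - 1 of col_idx and values. */
void spmv_csr(size_t n_rows,
              const int *row_ptr,
              const int *col_idx,
              const float *values,
              const float *x,
              float *y)
{
    for (size_t i = 0; i < n_rows; ++i) {
        float acc = 0.0f;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            /* x is read through col_idx, so this load is indirect
             * and its address depends on the matrix data. */
            acc += values[k] * x[col_idx[k]];
        }
        y[i] = acc;
    }
}
```

Note that the values and col_idx arrays are read sequentially, while x is read at scattered positions, which is one place where a dense kernel has an easier access pattern.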

I have read some papers and looked at an open-source library (clSPARSE). Most of them optimize for AMD and NVIDIA GPUs, not for Mali GPUs. Mali GPUs do not use warps to execute threads, so warp-based optimizations may not be useful on Mali.

Some papers use BCSR (block CSR) to make memory accesses more cache-friendly.
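As a rough illustration of the BCSR idea, the sketch below stores the matrix as dense 2x2 blocks and multiplies block by block, in plain C. The block size, the function name and the array names are all assumptions chosen for the example; real implementations pick the block size to suit the matrix and the hardware.

```c
#include <stddef.h>

#define BS 2  /* block edge length; 2x2 is an arbitrary choice for illustration */

/* y = A * x, with A stored in BCSR form using dense BS x BS blocks.
 * block_row_ptr has n_block_rows + 1 entries.
 * block_col_idx holds the block-column index of each stored block.
 * block_values holds BS * BS floats per block, row-major within the block. */
void spmv_bcsr(size_t n_block_rows,
               const int *block_row_ptr,
               const int *block_col_idx,
               const float *block_values,
               const float *x,
               float *y)
{
    for (size_t bi = 0; bi < n_block_rows; ++bi) {
        float acc[BS] = {0.0f};
        for (int b = block_row_ptr[bi]; b < block_row_ptr[bi + 1]; ++b) {
            const float *blk = &block_values[(size_t)b * BS * BS];
            /* One index lookup yields BS contiguous elements of x. */
            const float *xs = &x[(size_t)block_col_idx[b] * BS];
            for (int r = 0; r < BS; ++r)
                for (int c = 0; c < BS; ++c)
                    acc[r] += blk[r * BS + c] * xs[c];
        }
        for (int r = 0; r < BS; ++r)
            y[bi * BS + r] = acc[r];
    }
}
```

Compared with element-wise CSR, each column index here covers a whole block, so fewer indices are loaded and the reads of x are contiguous within a block. The cost is that any zeros inside a stored block are kept and multiplied explicitly.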

Maybe shared (local) memory or vectorization (float4/float8/float16) could help. If anyone has done this kind of optimization, please give some advice.
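To make the vectorization idea concrete, here is a plain C sketch that processes four nonzeros of a CSR row per iteration with four independent accumulators, then handles the leftover elements. It only mimics the shape of a float4 version; the function and array names are assumptions. In OpenCL C the four-wide step could be written with float4 values, for example using vload4 and dot, but whether that is faster on a particular Mali GPU would have to be measured.

```c
#include <stddef.h>

/* y = A * x for a CSR matrix, consuming four nonzeros per step.
 * The four accumulators stand in for the lanes of a float4. */
void spmv_csr_unroll4(size_t n_rows,
                      const int *row_ptr,
                      const int *col_idx,
                      const float *values,
                      const float *x,
                      float *y)
{
    for (size_t i = 0; i < n_rows; ++i) {
        int k = row_ptr[i];
        const int end = row_ptr[i + 1];
        float a0 = 0.0f, a1 = 0.0f, a2 = 0.0f, a3 = 0.0f;

        /* Main loop: values[k..k+3] are contiguous, so they could be
         * fetched with one vector load; the x elements are still gathered. */
        for (; k + 4 <= end; k += 4) {
            a0 += values[k]     * x[col_idx[k]];
            a1 += values[k + 1] * x[col_idx[k + 1]];
            a2 += values[k + 2] * x[col_idx[k + 2]];
            a3 += values[k + 3] * x[col_idx[k + 3]];
        }

        float acc = (a0 + a1) + (a2 + a3);

        /* Tail loop: rows whose length is not a multiple of four. */
        for (; k < end; ++k)
            acc += values[k] * x[col_idx[k]];

        y[i] = acc;
    }
}
```

Because the summation order differs from the simple scalar loop, results can differ in the last bits for general floating-point data. Rows with fewer than four nonzeros get no benefit from the wide step, so the gain depends on the row-length distribution of the matrix.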

  • There are lots of sparse formats available. We even did some research on this back in 2010 with a PhD student from Edinburgh (http://dl.acm.org/citation.cfm?id=1964196). Unfortunately, when he did his internship at ARM, the first Midgard GPU, the Mali-T604, was still being developed, so we ended up running experiments on NVIDIA and AMD platforms.

    I believe this would still be interesting to study today using a framework for benchmarking and optimisation such as Collective Knowledge (cknowledge.org).
