I recently compared the sparse matrix-vector multiplication (SpMV) performance of Arm Performance Libraries (ArmPL) with that of a native implementation.
The armpl_spmv_exec_d function performs the same as the native implementation for a sparse matrix in CSR format with dimensions of 16M x 16M.
Is that expected?
The code is compiled with Arm Compiler for Linux (ACfL) using the flags '-Ofast -mcpu=native' to enable SVE and fast math.
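For context, a "native" CSR SpMV usually means a plain row-by-row loop like the sketch below. This is only an illustration under assumptions, not the exact kernel used in the comparison: the function name csr_spmv, the 64-bit row pointers and the 32-bit column indices are all hypothetical choices.

```c
#include <stdint.h>

/*
 * Minimal CSR SpMV sketch: y = A * x.
 * Hypothetical helper, not part of ArmPL.
 *   n_rows  : number of rows in A
 *   row_ptr : n_rows + 1 offsets into col_idx / vals
 *   col_idx : column index of each stored nonzero
 *   vals    : value of each stored nonzero
 */
void csr_spmv(int64_t n_rows, const int64_t *row_ptr, const int32_t *col_idx,
              const double *vals, const double *x, double *y)
{
    for (int64_t i = 0; i < n_rows; ++i) {
        double sum = 0.0;
        /* Accumulate the dot product of row i with x. */
        for (int64_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            sum += vals[k] * x[col_idx[k]];
        }
        y[i] = sum;
    }
}
```

A loop of this shape does two floating-point operations per stored nonzero while loading one value and one index, so on large matrices it tends to be limited by memory bandwidth rather than by compute. That may be one reason a library routine and a simple loop end up close to each other.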