What's the difference in linking and usage between libarmpl.a and libarmpl_mp.a?
Is anything special required to link or use libarmpl_mp.a in a user's engine? In an almost identical environment, is it enough to replace Intel MKL functions with the corresponding Arm ones to achieve high performance?
Same question here, but with a different, related issue: "performance: single thread looks OK while multi-thread poor" (High Performance Computing (HPC) forum, Arm Community).
It's never enough to just replace a function call. To achieve the highest performance on a particular machine, the code must be carefully tailored to that machine.
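
One way to see whether a swapped-in call is actually performing well is to time the same routine at several thread counts and compare the numbers, rather than assuming the library will scale. The sketch below is only a small helper for that comparison. It assumes the routine is a DGEMM-style matrix multiply (counted as 2*m*n*k floating-point operations), that libarmpl_mp.a is the OpenMP-threaded build (as the _mp suffix suggests) so the thread count would be set with OMP_NUM_THREADS, and that the timings were measured separately. The function names and the timing values are hypothetical, not measurements from this thread.

```python
def dgemm_gflops(m, n, k, seconds):
    """GFLOP/s for one (m x k) by (k x n) matrix multiply, counted as 2*m*n*k flops."""
    return 2.0 * m * n * k / seconds / 1e9


def parallel_efficiency(t_serial, t_parallel, threads):
    """Speedup over the one-thread run divided by the thread count (1.0 is ideal)."""
    return (t_serial / t_parallel) / threads


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds for a 4096 x 4096 x 4096 multiply,
    # keyed by thread count. Replace with timings measured on the target machine.
    timings = {1: 8.0, 4: 2.5, 16: 1.6}
    for threads, secs in sorted(timings.items()):
        rate = dgemm_gflops(4096, 4096, 4096, secs)
        eff = parallel_efficiency(timings[1], secs, threads)
        print(f"{threads:3d} threads: {rate:8.1f} GFLOP/s, efficiency {eff:.2f}")
```

An efficiency that drops sharply as threads are added matches the "single thread OK, multi-thread poor" symptom and points at things such as thread placement, problem size per thread, or memory bandwidth rather than at the function call itself.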