Hello,
I am developing embedded image-processing software on a Zynq MPSoC Cortex-A53 (Armv8-A), and I need some help developing a specific algorithm.
The algorithm involves many FFT and matrix computations. As the highest priority, we need to implement the inversion of a complex matrix with dimensions up to 30x30. (By complex matrix, I mean a matrix of complex float numbers with real and imaginary parts.)
The most significant constraint is obviously timing: we usually develop our algorithms with ARM NEON SIMD to make them faster.
Consequently, I am looking for an ARM-compatible library to help me develop this complex matrix inversion using ARM NEON.
I have not found any library that satisfies these 2 constraints:
- Matrix inversion with complex numbers.
- Use of ARM NEON.
For example, I have studied the Ne10 library, but it provides matrix inversion for real numbers only, not complex ones.
Do you know of a library (using ARM NEON) that I should look at to help me develop this complex matrix inversion?
Thanking you in advance,
Laurent BOUCHOT.
Hi Laurent,
In ArmPL, the functions cgetri/zgetri return the inverse of a complex matrix (for single/double precision). They work from an LU factorization (computed by cgetrf/zgetrf), which in turn uses gemm (matrix multiplication), which is heavily optimized (including with NEON instructions), so the performance should be good. Let us know if not!
www.netlib.org/.../group__complex16_g_ecomputational_gab490cfc4b92edec5345479f19a9a72ca.html
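To make the approach concrete, here is a minimal, unoptimized C sketch of inversion via LU factorization with partial pivoting, which is the same overall scheme as the getrf/getri pair. It is plain scalar C99, not ArmPL code and not NEON-tuned. The names `mag2`, `lu_factor`, `invert_complex` and the `MAX_N` bound are hypothetical, chosen only for illustration. It may be useful as a small reference to check a library's output against.

```c
#include <assert.h>
#include <complex.h>

#define MAX_N 32 /* hypothetical bound; the thread mentions sizes up to 30x30 */

/* Squared magnitude of z (avoids a sqrt when only comparing sizes). */
float mag2(float complex z)
{
    return crealf(z) * crealf(z) + cimagf(z) * cimagf(z);
}

/* In-place LU factorization with partial pivoting of a row-major n x n
 * complex matrix. piv[k] is the row that was swapped with row k.
 * Returns 0 on success, -1 if an exactly zero pivot is met. */
int lu_factor(float complex *a, int n, int *piv)
{
    for (int k = 0; k < n; ++k) {
        int p = k;
        float best = mag2(a[k * n + k]);
        for (int i = k + 1; i < n; ++i) {
            float m = mag2(a[i * n + k]);
            if (m > best) { best = m; p = i; }
        }
        if (best == 0.0f) return -1;
        piv[k] = p;
        if (p != k) {
            for (int j = 0; j < n; ++j) {
                float complex t = a[k * n + j];
                a[k * n + j] = a[p * n + j];
                a[p * n + j] = t;
            }
        }
        for (int i = k + 1; i < n; ++i) {
            a[i * n + k] /= a[k * n + k];          /* multiplier, stored in L */
            for (int j = k + 1; j < n; ++j)
                a[i * n + j] -= a[i * n + k] * a[k * n + j];
        }
    }
    return 0;
}

/* Writes the inverse of a into inv (both row-major, n x n).
 * a is overwritten by its LU factors. Returns 0 on success, -1 if singular. */
int invert_complex(float complex *a, float complex *inv, int n)
{
    int piv[MAX_N];
    float complex x[MAX_N];

    assert(n > 0 && n <= MAX_N);
    if (lu_factor(a, n, piv) != 0) return -1;

    for (int c = 0; c < n; ++c) {
        /* Right-hand side: column c of the identity, with row swaps applied. */
        for (int i = 0; i < n; ++i) x[i] = (i == c) ? 1.0f : 0.0f;
        for (int k = 0; k < n; ++k) {
            if (piv[k] != k) {
                float complex t = x[k];
                x[k] = x[piv[k]];
                x[piv[k]] = t;
            }
        }
        /* Forward substitution with L (unit diagonal). */
        for (int i = 1; i < n; ++i)
            for (int j = 0; j < i; ++j)
                x[i] -= a[i * n + j] * x[j];
        /* Back substitution with U. */
        for (int i = n - 1; i >= 0; --i) {
            for (int j = i + 1; j < n; ++j)
                x[i] -= a[i * n + j] * x[j];
            x[i] /= a[i * n + i];
        }
        for (int i = 0; i < n; ++i) inv[i * n + c] = x[i];
    }
    return 0;
}
```

The input array is overwritten with its LU factors, so keep a copy if the original matrix is still needed. The exact-zero pivot test only catches exactly singular matrices; an ill-conditioned matrix would need a tolerance instead.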
Chris.
Hi Chris,
I have been running some tests over the past few days. I compared the Eigen and ArmPL libraries in terms of execution time for a double-precision complex matrix inversion of size 24x24.
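A measurement of this kind can be sketched with a small timing harness like the one below. This is only an illustration, not the code used for the tests mentioned here: `bench_fn`, `mean_time_us` and `count_calls` are hypothetical names, and it assumes a Linux/POSIX environment where `clock_gettime` is available (a bare-metal setup would need a hardware timer instead).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L /* for clock_gettime under strict C modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: one run of the operation being measured. */
typedef void (*bench_fn)(void *ctx);

/* Calls fn warmup times untimed (to warm caches), then reps times timed.
 * Returns the mean wall-clock time per timed call, in microseconds. */
double mean_time_us(bench_fn fn, void *ctx, int warmup, int reps)
{
    struct timespec t0, t1;

    assert(fn != NULL && reps > 0);
    for (int i = 0; i < warmup; ++i) fn(ctx);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < reps; ++i) fn(ctx);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / 1e3 / (double)reps;
}

/* Placeholder workload: counts how many times it was called. */
void count_calls(void *ctx)
{
    *(volatile unsigned *)ctx += 1u;
}
```

To time an inversion, the library call would be wrapped in a function matching `bench_fn`, with the matrix re-initialized on each run if the routine overwrites its input. Averaging over many repetitions matters here because a single run at the 0.05 ms scale is close to the timer's own overhead.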
I had hoped to get better measurements with ArmPL, but that was not really the case. Our goal is around 0.05 ms, but I am not sure that is achievable!
What do you think about it?
Do you have other libraries in mind which I can test and compare?
Any remarks and advice are welcome!
Thanks.
Regards,
Laurent.
Hi Laurent,
Thanks for trying out ArmPL. It's no surprise that there's not much of an advantage in this case: 24x24 is a small problem (ArmPL is targeted at HPC), and the A53 is not a core we optimise for any more. However, we did use to target that core, so if you're interested, contact support-hpc-sw@arm.com and we can see if we can point you to one of the old versions with A53-specific optimisations.