Hello,
I am developing embedded software on a Zynq MPSoC Cortex-A53 (Armv7/Armv8) for image processing, and I need some help developing a specific algorithm.
The algorithm involves many FFT and matrix computations. Our highest priority is to implement the inversion of a complex matrix with dimensions up to 30x30. (By complex matrix, I mean a matrix of complex float numbers with real and imaginary parts.)
The most significant constraint is obviously timing: we usually develop our algorithms with ARM NEON SIMD to make them faster.
Consequently, I am looking for an ARM-compatible library to help me develop this complex matrix inversion using ARM NEON.
I cannot find any library that satisfies these 2 constraints:
- Matrix inversion with complex numbers.
- Use of ARM NEON.
For example, I have studied the Ne10 library, but it provides matrix inversion for real numbers only, not complex ones.
Do you know of a library (using ARM NEON) that I should look at to help me develop this complex matrix inversion?
Thanking you in advance,
Laurent BOUCHOT.
Hi Laurent,
Have you taken a look at the Arm Compute Library and the Arm Performance Libraries?
Here are some links to start with:
https://community.arm.com/graphics/b/blog/posts/arm-compute-library-for-computer-vision-and-machine-learning-now-publicly-available
https://developer.arm.com/technologies/compute-library
https://github.com/ARM-software/ComputeLibrary
https://developer.arm.com/products/software-development-tools/hpc/arm-performance-libraries
The exact code you need may not be included, but these libraries will give you a good idea of how to optimize for NEON, and they contain matrix and FFT examples.
Thanks,
Jason
Hi Jason,
Thanks a lot for your answer.
In fact, I was already looking at both ARM libraries (ACL first because it is open source, and then APL).
In parallel, I have found a theoretical formula to split the complex numbers and invert a larger matrix (double the size) of real numbers instead.
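One common form of such a formula, assuming it is the one meant here: for M = A + iB (n x n), the real 2n x 2n matrix R = [A -B; B A] has the inverse [C -D; D C], where M^-1 = C + iD. Below is a minimal plain-C sketch of that idea. The function names invert_real and complex_invert_via_real are made up for illustration, storage is assumed to be row-major single precision, and the loops are scalar (no NEON), so it shows the structure only and not a tuned implementation.

```c
#include <assert.h>
#include <complex.h>
#include <math.h>
#include <stdlib.h>

/* Gauss-Jordan inversion of a real n x n row-major matrix with partial
 * pivoting. 'a' is destroyed; the result is written to 'inv'.
 * Returns 0 on success, -1 if the matrix is singular. */
static int invert_real(float *a, float *inv, int n)
{
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            inv[i * n + j] = (i == j) ? 1.0f : 0.0f;

    for (int col = 0; col < n; ++col) {
        int piv = col;                          /* row with the largest pivot */
        for (int r = col + 1; r < n; ++r)
            if (fabsf(a[r * n + col]) > fabsf(a[piv * n + col]))
                piv = r;
        if (a[piv * n + col] == 0.0f)
            return -1;
        for (int j = 0; j < n && piv != col; ++j) {   /* swap the two rows */
            float t = a[col * n + j];
            a[col * n + j] = a[piv * n + j];
            a[piv * n + j] = t;
            t = inv[col * n + j];
            inv[col * n + j] = inv[piv * n + j];
            inv[piv * n + j] = t;
        }
        float d = 1.0f / a[col * n + col];      /* normalise the pivot row */
        for (int j = 0; j < n; ++j) {
            a[col * n + j] *= d;
            inv[col * n + j] *= d;
        }
        for (int r = 0; r < n; ++r) {           /* clear the column elsewhere */
            float f = a[r * n + col];
            if (r == col || f == 0.0f)
                continue;
            for (int j = 0; j < n; ++j) {
                a[r * n + j] -= f * a[col * n + j];
                inv[r * n + j] -= f * inv[col * n + j];
            }
        }
    }
    return 0;
}

/* Invert the complex n x n matrix m through its real 2n x 2n embedding.
 * Returns 0 on success, -1 on a singular matrix or allocation failure. */
int complex_invert_via_real(const float complex *m, float complex *minv, int n)
{
    assert(n > 0);
    int n2 = 2 * n;
    float *r = malloc(sizeof *r * (size_t)n2 * (size_t)n2);
    float *rinv = malloc(sizeof *rinv * (size_t)n2 * (size_t)n2);
    if (!r || !rinv) {
        free(r);
        free(rinv);
        return -1;
    }
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            float a = crealf(m[i * n + j]);
            float b = cimagf(m[i * n + j]);
            r[i * n2 + j] = a;                  /* top-left block:      A */
            r[i * n2 + n + j] = -b;             /* top-right block:    -B */
            r[(n + i) * n2 + j] = b;            /* bottom-left block:   B */
            r[(n + i) * n2 + n + j] = a;        /* bottom-right block:  A */
        }
    }
    int rc = invert_real(r, rinv, n2);
    if (rc == 0)                                /* C is top-left, D is bottom-left */
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                minv[i * n + j] = rinv[i * n2 + j] + rinv[(n + i) * n2 + j] * I;
    free(r);
    free(rinv);
    return rc;
}
```

Note that the real matrix has twice the dimension, so its inversion costs roughly 8 times the real operations of an n x n real inversion, whereas working directly on complex values costs roughly 4 times. The embedding is therefore mostly a convenience when only a real-valued inversion routine is available.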
The problem is that I am trying to find an API (using NEON) that offers inversion of large matrices (any size, not just 2x2 or 3x3), and I cannot find one.
I am not sure that the ACL library provides matrix inversion. Does it? If so, do you know the name of the class or module that does it? I am still looking for it.
I know that OpenCL provides it, but I am not sure ACL does.
Thanking you in advance.
Regards,
Laurent
In ArmPL, the functions cgetri/zgetri will return the inverse of a complex matrix (for single/double precision). They work from an LU factorization (computed beforehand by cgetrf/zgetrf), which in turn uses gemm (matrix multiplication), which is heavily optimized (including with NEON instructions), so the performance should be good. Let us know if not!
www.netlib.org/.../group__complex16_g_ecomputational_gab490cfc4b92edec5345479f19a9a72ca.html
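To illustrate the factor-then-invert structure described above, here is a minimal plain-C sketch. It is not ArmPL code and does not call cgetri: the names clu_factor and clu_invert are invented for this example, storage is assumed to be row-major, and the loops are scalar. It factors P*A = L*U with partial pivoting and then solves A*X = I column by column, which is not necessarily how getri proceeds internally.

```c
#include <assert.h>
#include <complex.h>

/* Squared magnitude of a complex value (avoids a square root). */
static float norm2(float complex z)
{
    return crealf(z) * crealf(z) + cimagf(z) * cimagf(z);
}

/* Step 1: factor P*A = L*U in place with partial pivoting.
 * 'a' is n x n row-major; on exit it holds U in its upper triangle and the
 * multipliers of the unit-diagonal L below the diagonal.
 * piv[k] is the row that was swapped with row k.
 * Returns 0 on success, -1 if the matrix is singular. */
int clu_factor(float complex *a, int *piv, int n)
{
    assert(n > 0);
    for (int k = 0; k < n; ++k) {
        int p = k;
        float best = norm2(a[k * n + k]);
        for (int r = k + 1; r < n; ++r) {       /* find the largest pivot */
            float v = norm2(a[r * n + k]);
            if (v > best) {
                best = v;
                p = r;
            }
        }
        if (best == 0.0f)
            return -1;
        piv[k] = p;
        for (int j = 0; j < n && p != k; ++j) { /* swap rows k and p */
            float complex t = a[k * n + j];
            a[k * n + j] = a[p * n + j];
            a[p * n + j] = t;
        }
        for (int r = k + 1; r < n; ++r) {       /* eliminate below the pivot */
            a[r * n + k] /= a[k * n + k];
            for (int j = k + 1; j < n; ++j)
                a[r * n + j] -= a[r * n + k] * a[k * n + j];
        }
    }
    return 0;
}

/* Step 2: build the inverse by solving A*X = I one column at a time,
 * using the factors and pivots produced by clu_factor. */
void clu_invert(const float complex *lu, const int *piv, float complex *inv, int n)
{
    for (int c = 0; c < n; ++c) {
        for (int i = 0; i < n; ++i)             /* start from column c of I */
            inv[i * n + c] = (i == c) ? 1.0f : 0.0f;
        for (int k = 0; k < n; ++k) {           /* apply the row swaps */
            float complex t = inv[k * n + c];
            inv[k * n + c] = inv[piv[k] * n + c];
            inv[piv[k] * n + c] = t;
        }
        for (int i = 1; i < n; ++i)             /* forward substitution with L */
            for (int j = 0; j < i; ++j)
                inv[i * n + c] -= lu[i * n + j] * inv[j * n + c];
        for (int i = n - 1; i >= 0; --i) {      /* back substitution with U */
            for (int j = i + 1; j < n; ++j)
                inv[i * n + c] -= lu[i * n + j] * inv[j * n + c];
            inv[i * n + c] /= lu[i * n + i];
        }
    }
}
```

A caller would copy the input matrix, pass the copy to clu_factor, check the return value, and then call clu_invert with the factored copy and the pivot array.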
Chris.
Hi Chris,
I have been running some tests over the past few days. I compared the Eigen and ArmPL libraries in terms of execution time for a double-precision complex matrix inversion of size 24x24.
I had hoped to get better timings with ArmPL, but not really. Our goal was to get around 0.05 ms, but I am not so sure it is possible!
What do you think about it?
Do you have other libraries in mind that I could test and compare?
Any remarks and advice are welcome!
Thanks.
Laurent.
Hi Laurent,
Thanks for trying out ArmPL. It's no surprise that there is not much of an advantage in this case: 24x24 is a small problem (ArmPL is targeted at HPC), and the A53 is not a core we optimise for any more. However, we did use to target that core, so if you're interested, contact support-hpc-sw@arm.com and we can see if we can point you to one of the old versions with A53-specific optimisations.