I cannot find any information on the number of CPU cycles needed to execute a 1024-point complex FFT on 32-bit floating-point data on a Cortex-R52+ using Neon. Assume that the code executes from TCM and that all data is in TCM.
Also, I see examples of a 4x4 matrix multiply, but no information on the number of CPU cycles it takes.
Are cycle counts available for either of these operations?