Hi all, I have firmware running on an NXP LPCLink2 board (LPC4370: 204 MHz Cortex-M4 MCU) which basically does this:
My problem is that my code is too slow, and every now and then an overwrite occurs.
Using the DMA, I save the ADC data, which I get in two's complement format (offset binary is also available), in a uint32_t buffer, and then try to prepare it for the CMSIS DSP function by converting the buffer to float32_t. This conversion is where the overwrite occurs. It's worth saying that I'm currently using software floating point, not the hardware FPU.
The CMSIS library also accepts fractional formats like q31_t, q15_t and so on, and since I don't strictly need floating-point maths, I could use these formats if that would save me precious time.
It feels like I'm missing something important about this step, which is no surprise since this is my first project on a complex MCU. Any help, hint or advice would be highly appreciated and would help me in my thesis.
I'll leave here the link to the (more detailed) question I asked in the NXP forums, just in case: LPC4370: ADCHS, GPDMA and CMSIS DSP | NXP Community.
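For illustration, the conversion step described above could go straight to q15_t with integer operations only, which avoids software floating point entirely. This is only a sketch under assumptions: it assumes each 32-bit DMA word carries one 12-bit two's complement sample in bits 11:0 (the real ADCHS packing may differ), the names adc_sign_extend and adc_to_q15 are hypothetical, and q15_t is declared locally as a stand-in for the type in arm_math.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the CMSIS DSP type from arm_math.h so the sketch builds alone. */
typedef int16_t q15_t;

/* ASSUMPTION: one 12-bit two's complement sample sits in bits 11:0 of each word. */
static inline int16_t adc_sign_extend(uint32_t raw)
{
    uint32_t v = raw & 0xFFFu;                        /* keep the 12 sample bits */
    return (int16_t)((int32_t)(v ^ 0x800u) - 0x800);  /* map to -2048..2047 */
}

/* Convert n raw DMA words to q15_t, scaling 12-bit samples up to 16 bits. */
void adc_to_q15(const uint32_t *src, q15_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Multiply by 16 instead of shifting, since left-shifting a
           negative value is undefined behaviour in C. */
        dst[i] = (q15_t)(adc_sign_extend(src[i]) * 16);
    }
}
```

The q15_t output can be passed to the q15 variants of the CMSIS DSP functions. If float32_t is still needed later, CMSIS DSP also provides arm_q15_to_float for that step.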
Thanks in advance!
It's great to hear about the optimization results.
abet wrote:
Today I did some tests using the -O3 optimization level for my project and the result is great (using thibaut's function with no sign extension): the elapsed time for 128 samples is roughly 18 us, compared to 160 us without optimization!
Fun fact: compiling the CMSIS DSP with -O2 gives slightly better performance than using -O3! (updated the old post)
The -O2 result is a great observation. It may be because -O3 most likely unrolls loops more aggressively than -O2.
If that's the case, it would mean the larger code takes longer to fetch from SPIFI, and that is what slows things down (I'm only guessing here).
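To make the unrolling guess concrete, here is a small sketch of what unrolling does to a loop. The function names sum_rolled and sum_unrolled4 are hypothetical, and this is hand-written unrolling for illustration only, not what the compiler actually emits at -O3. Both functions compute the same result, but the unrolled one has a loop body about four times larger, and that is more code to fetch from flash.

```c
#include <stddef.h>
#include <stdint.h>

/* Rolled loop: one sample per iteration, small code footprint. */
int32_t sum_rolled(const int16_t *x, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += x[i];
    }
    return acc;
}

/* Unrolled by 4: fewer branches per sample, but a larger loop body. */
int32_t sum_unrolled4(const int16_t *x, size_t n)
{
    int32_t acc = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        acc += x[i];
        acc += x[i + 1];
        acc += x[i + 2];
        acc += x[i + 3];
    }
    for (; i < n; i++) {   /* handle the remaining 0..3 samples */
        acc += x[i];
    }
    return acc;
}
```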
If it's possible for you to link to a binary version of a pre-compiled CMSIS DSP library, try that.
I know that the people who developed the DSP library spent a great deal of time optimizing it, as if it were the most important thing in the world to them.
So if a precompiled library exists and you can link directly to it, you'll most likely get the best performance from the DSP library.