Hi all, I have firmware running on an NXP LPC-Link 2 board (LPC4370: 204 MHz Cortex-M4 MCU) which basically does this:
My problem is that my code is too slow, and every now and then an overwrite occurs.
Using the DMA, I save the ADC data, which arrives in two's complement format (offset binary is also available), into a uint32_t buffer. I then prepare the samples for the CMSIS DSP function by converting the buffer to float32_t: this is where the overwrite occurs. It's worth saying that I'm currently using software floating point, not hardware.
The CMSIS library also accepts fractional formats like q31_t, q15_t and so on, and since I don't strictly need floating point maths I could even use these formats if that would save me precious time. It feels like I'm missing something important about this step, which is no surprise since this is my first project on a complex MCU. Any help, hint or advice would be highly appreciated and would help me with my thesis.
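To make the q15_t idea concrete, here is a minimal sketch of a conversion that avoids floating point altogether. It assumes 12-bit samples with one sample in the low bits of each uint32_t word, which may not match how the ADCHS actually packs its FIFO, so check the data format in the user manual first. The function names and the `ADC_BITS` constant are made up for illustration, and q15_t is redefined locally only so the snippet builds without the CMSIS headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same underlying type CMSIS-DSP uses for q15_t; redefined here only so
   this sketch compiles without the library headers. */
typedef int16_t q15_t;

/* ASSUMPTION: 12-bit samples in the low bits of each 32-bit word. */
#define ADC_BITS 12u
#define ADC_MASK ((1u << ADC_BITS) - 1u)

/* Left-align a 12-bit field in 16 bits and reinterpret it as signed.
   The XOR/subtract pair sign-extends without relying on
   implementation-defined unsigned-to-signed casts. */
static q15_t field_to_q15(uint32_t field)
{
    uint32_t u = (field & ADC_MASK) << (16u - ADC_BITS);
    return (q15_t)((int32_t)(u ^ 0x8000u) - 0x8000);
}

/* Two's complement input: the ADC sign bit becomes the q15 sign bit. */
static void adc_twos_to_q15(const uint32_t *src, q15_t *dst, size_t n)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        dst[i] = field_to_q15(src[i]);
    }
}

/* Offset binary input: flipping the top bit of the field turns it into
   two's complement, then the same conversion applies. */
static void adc_offset_to_q15(const uint32_t *src, q15_t *dst, size_t n)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        dst[i] = field_to_q15(src[i] ^ (1u << (ADC_BITS - 1u)));
    }
}
```

The loop is only a mask, a shift and a subtract per sample, so it should be far cheaper than a software int-to-float conversion. If a float32_t buffer is still needed at some later stage, CMSIS-DSP provides arm_q15_to_float for that step.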
I'll leave here the link to the (more detailed) question I asked in the NXP forums, just in case: LPC4370: ADCHS, GPDMA and CMSIS DSP | NXP Community.
Thanks in advance!
abet wrote:
Do you think that trying to rebuild my CMSIS library with a different optimization level will likely be a serious improvement?
It certainly could improve things if the code is currently optimized for size or not optimized at all.
If you use a pre-compiled library, then the library is most likely already built for speed. However, if the code is executing from Flash memory, I think it might be worth moving it to SRAM.
If you rebuild, I would recommend putting the code in a "ramcode" section and optimizing for highest speed.
In addition, I would recommend running any other time-critical code from SRAM. However, make sure your code resides in a different section of RAM than the section your DMA will access; this is very important.
Executing code from SRAM will give you a huge performance increase on an LPC40xx.
But if the DMA and the CPU fight over who's going to use the SRAM section, you might end up with worse performance than before.
So make sure that the two sections are independent.
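As a rough illustration of the "ramcode" idea with GCC, a function can be tagged with a section attribute, as sketched below. The section name `.ramcode`, the `RAMCODE` macro and the `block_energy` function are hypothetical; the attribute only names the section. Your linker script still has to place that section in an SRAM bank the DMA does not touch, and the startup code has to copy it there before it is called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: the linker script defines an output section that collects
   ".ramcode" and locates it in an SRAM bank not used by the DMA.
   On a non-ARM host build the macro expands to nothing. */
#if defined(__GNUC__) && defined(__arm__)
#define RAMCODE __attribute__((section(".ramcode"), noinline))
#else
#define RAMCODE
#endif

/* Example of a time-critical routine: sum of squares over one block.
   The 64-bit accumulator cannot overflow for any realistic block size. */
RAMCODE int64_t block_energy(const int16_t *x, size_t n)
{
    assert(x != NULL);
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)x[i] * (int32_t)x[i];
    }
    return acc;
}
```

Building the file with -O2 or -O3 instead of -Os or -O0 covers the "optimize for highest speed" part. It is worth checking the map file afterwards to confirm the function really ended up at an SRAM address.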
Jens, Andrea is using the LPC-Link 2, which is based on the LPC4370, a flashless MCU. Quad SPI Flash memory is used on this board. Since the fastest possible code execution is sought, copying the code to RAM rather than executing in place is imperative (I presume this is what Andrea is doing).
As far as I recall, the LPC4xxx is able to execute code directly from SPIFI (please correct me if I'm wrong).
But even if the code is already running from SRAM, it is a good idea to put the code in one SRAM, the data in another SRAM and the DMA buffers in a third SRAM, so that there are no stalls (collisions).
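The same section mechanism can be used for data. Here is a minimal sketch that keeps the DMA target buffer and the CPU working buffer in separately named sections, so the linker script can map them to different SRAM banks. The section names `.dma_buffers` and `.dsp_data`, the macros, the buffer names and `BLOCK_LEN` are all hypothetical, and the actual bank assignment lives in the linker script, not in this code.

```c
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: the linker script maps these two sections to different
   SRAM banks. On a non-ARM host build the macros expand to nothing. */
#if defined(__GNUC__) && defined(__arm__)
#define DMA_BUF  __attribute__((section(".dma_buffers"), aligned(4)))
#define DSP_DATA __attribute__((section(".dsp_data"), aligned(4)))
#else
#define DMA_BUF
#define DSP_DATA
#endif

#define BLOCK_LEN 256u  /* hypothetical block size */

/* Written by the DMA, so it is volatile from the CPU's point of view. */
static volatile uint32_t adc_dma_buf[BLOCK_LEN] DMA_BUF;

/* Touched only by the CPU while it processes a block. */
static uint32_t work_buf[BLOCK_LEN] DSP_DATA;

/* Copy one completed DMA block into the working buffer, so the heavy
   processing afterwards never reads the bank the DMA is writing to. */
static void stage_block(void)
{
    for (size_t i = 0; i < BLOCK_LEN; i++) {
        work_buf[i] = adc_dma_buf[i];
    }
}
```

With this layout the CPU and the DMA only share a bank during the short copy. Whether the copy pays for itself depends on how long the processing takes, so it is worth measuring both variants.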
Yes, the external Flash is memory-mapped and code can be executed directly.
Yes.
Based on Andrea's updates my assumption that the code is running in SRAM may be wrong.