This is my first post; sorry if it's in the wrong place or if I've unwittingly violated any accepted norms or conventions around here...
My embedded application uses a Cortex-M4F with 128 kB of SRAM. I need to do a 4096-point real FFT and I'm trying to use arm_rfft_fast_f32(), which seems ideally suited to my needs. My FFT will be a fixed size, meaning I only need to do a 4096-point transform. I've done everything correctly, including instantiating an arm_rfft_fast_instance_f32.
When I build my project (which, at the moment, does little more than the FFT in question), the image turns out to be more than 130 kB in size. When I look in the linker map file generated by my IDE (Keil), I can see that my arm_rfft_fast_instance_f32 pulls in quite a few tables of twiddle factors for the FFT.
Given that a) I only need the largest one (the 4096-point table) and b) the smaller tables are just "decimated by 2" versions of the larger tables, is there a way to get rid of the unneeded tables? If I could do that, I could cut the memory footprint of this thing by more than 25% and it would fit into SRAM, which is what I really need it to do.
Has anyone else had to deal with this issue? I find it odd that this particular FFT implementation is so incredibly storage-intensive. Is there a "smaller" FFT implementation that doesn't have such a big memory footprint?
Thanks in advance.
Hi Glennz,
ARM Compiler 5 includes various techniques to reduce the code size of the final image. In your case, I would suggest looking at the following:
Let me know if these options help to reduce the code size and fit the image in the SRAM.
Best Regards,
Stefano