I have some questions about the correct use of the CMSIS DSP library call arm_fir_f32. First, I'll provide some background on what I am doing and how things are set up.
I have an STM32F4 Discovery board and am using IAR EWARM for programming. Just for testing purposes, I'm generating a low-frequency test signal at 40 Hz and feeding it into one of the ADC inputs. The signal is biased so that it swings from 0 to about 2.5 Vpp. It carries a low-to-moderate amount of broadband noise, but at this point I am not purposely mixing in or introducing any other signals. A timer interrupt is set to a sample frequency of 2 kHz, with a sampling buffer of 2048 samples.
I have already tested and am using the FFT function arm_cfft_f32, and can accurately determine (track) the frequency of the incoming signal when I change it at the source. This seems to be working well.
Now, I would like to use the arm_fir_f32 filter. To do this, I started by reading the CMSIS documentation for the function. To generate the tap coefficients for a FIR low-pass, I am using this website's online tool.
I generated a 4th-order filter and set the sampling rate to match my software, with a cutoff of 60 Hz. I forced generation to 24 taps so the count would be even. So BLOCK_SIZE is 32, and the number of blocks is 1024/32 = 32.
Following the CMSIS example for this function, I believe I've set everything up correctly. The chain looks like this:
ADC --> FIR --> FFT
However, I'm not getting the result I would expect. The values returned in the FFT's output buffer are enormous (they are not if I comment out or bypass the FIR calls). This leads me to believe I am missing a step. Do I need to normalize the values? I thought that because I pass the rate into the FIR setup this wouldn't be required, but maybe that is incorrect.
Can someone please provide some insight or assistance as to what I am missing or doing incorrectly to apply the FIR processing?
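One thing worth double-checking on the FIR side (an assumption on my part, since the init code isn't shown here): arm_fir_f32 requires a state buffer of numTaps + blockSize - 1 floats, and the last numTaps - 1 samples must be carried over between successive block calls; an undersized state array is a classic cause of garbage output. As a plain-C illustration of that block/state pattern (a stand-in for clarity, not the CMSIS implementation):

```c
#include <string.h>
#include <stddef.h>

/* Plain-C stand-in for arm_fir_f32's block processing (illustration
 * only). The key detail: `state` must hold numTaps + blockSize - 1
 * floats, and the last numTaps - 1 samples are preserved between
 * calls so the filter history survives across blocks. */
static void fir_block(float *state, const float *coeffs, size_t numTaps,
                      const float *in, float *out, size_t blockSize)
{
    /* append the new block after the numTaps-1 history samples */
    memcpy(state + numTaps - 1, in, blockSize * sizeof(float));

    for (size_t n = 0; n < blockSize; n++) {
        float acc = 0.0f;
        for (size_t k = 0; k < numTaps; k++)
            acc += coeffs[k] * state[n + numTaps - 1 - k];
        out[n] = acc;
    }

    /* keep the last numTaps-1 samples as history for the next block */
    memmove(state, state + blockSize, (numTaps - 1) * sizeof(float));
}
```

With a 4-tap moving average and a constant input, the output settles to the input value after numTaps - 1 samples, which is a quick sanity check that the state carry-over is right.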
Hi Dr. Beckmann,
The code is posted below. It runs in the Timer interrupt, defined by:
TIM_TimeBaseStructure.TIM_Period = ( (RCC_Clocks.PCLK2_Frequency / SAMPLE_FREQ) - 1);
Here's where the issue seems to be. It works perfectly with frequency precision to 0.5Hz, or better, if I use these defines:
#define SAMPLE_FREQ (2000)
#define INPUT_SAMPLES (4096)
Here are the bins at/around maximum when running with the above defines:
cfftBuffer = 49.8
cfftBuffer = 167.3
cfftBuffer = 1024.9
cfftBuffer = 503.9
cfftBuffer = 226.6
cfftBuffer = 142.8
cfftBuffer = 96.9
Frequency calc: 55 * (2000/4096) = 26.9 (input generator at 27Hz)
If I use the defines as below in the sample code, it does not compute the correct frequency; the maximum bins do not seem to be in the correct place. Here are the bins around the maximum:
(sorry about the previous post, I pasted the incorrect bins from another run)
cfftBuffer = 27.7
cfftBuffer = 13.5
cfftBuffer = 28.4
cfftBuffer = 563.1
cfftBuffer = 156.8
cfftBuffer = 91.2
cfftBuffer = 67.1
Frequency calc: 12 * (1000/2048) = 5.9 (input generator at 27Hz)
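As a sanity check on the arithmetic above: both estimates use the standard DFT relation f = k * Fs / N, where N is the number of points actually transformed. A minimal host-side check (helper name is mine, for illustration):

```c
#include <math.h>

/* Standard DFT relation: bin index k corresponds to frequency
 * f = k * Fs / N, where Fs is the sample rate and N the number of
 * points in the transform. */
static float bin_to_freq(unsigned k, float fs, unsigned n)
{
    return (float)k * fs / (float)n;
}
```

Note this only holds if N matches the transform length actually used; feeding 2048 real samples through a 1024-point complex FFT changes what N means, which may be related to the misplaced bins.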
BTW - I did try the decimation method you suggested in your post (just a 4-point moving-average filter) but got similarly bad results.
Thank you again for assisting me with this, very appreciated. (just hope I'm not missing something obvious)
/***** CODE BELOW ***/
#define SAMPLE_FREQ (uint32_t)1000
#define INPUT_SAMPLES (uint32_t)2048
#define FFT_SIZE (uint32_t)(INPUT_SAMPLES/2)
#define MAX_FREQ (float32_t)50.0
/* max energy bin index: divide sample count by sample rate,
   then multiply by max frequency and cast to int */
#define MAX_ENERGY_INDEX (uint32_t)((MAX_FREQ * INPUT_SAMPLES) / SAMPLE_FREQ)
/* Timer 6 Interrupt Handler */
void TIM6_DAC_IRQHandler(void)
{
  if( TIM_GetITStatus(TIM6, TIM_IT_Update) != RESET )
  {
    volatile uint16_t uhADCxConvertedValue;
    float32_t smplRate = 0;

    uhADCxConvertedValue = ADC_get();
    //subtract small offset from sigGen
    ADC_Values[sampleCount] = ( ((uhADCxConvertedValue * VREF) / ADC_COUNTS) - 0.58f );

    if( ++sampleCount == INPUT_SAMPLES )
    {
      inputBufPtr  = ADC_Values;
      outputBufPtr = outputBuffer;

      arm_fir_init_f32(&S, FILTER_TAP_NUM, (float32_t *)filter_taps, firStateF32, BLOCK_SIZE);
      for( uint16_t i = 0; i < NUM_BLOCKS; i++ )
      {
        arm_fir_f32(&S, inputBufPtr + (i * BLOCK_SIZE), outputBufPtr + (i * BLOCK_SIZE), BLOCK_SIZE);
      }

      if( FFT_SIZE == 1024 )
        arm_cfft_f32(&arm_cfft_sR_f32_len1024, outputBuffer, ifftFlag, doBitReverse);
      else
        arm_cfft_f32(&arm_cfft_sR_f32_len2048, outputBuffer, ifftFlag, doBitReverse);

      /* Calculate the magnitude at each bin */
      arm_cmplx_mag_f32(outputBuffer, cfftBuffer, FFT_SIZE);

      index = 0;
      maxValue = 0;
      //find maximum bin (CMSIS call doesn't appear to work correctly!)
      for( uint16_t i = 0; i < MAX_ENERGY_INDEX; i++ )
      {
        if( cfftBuffer[i] > maxValue )
        {
          maxValue = cfftBuffer[i];
          index = i;
        }
      }

      //compute frequency at max bin
      smplRate = (INPUT_SAMPLES / 1.0f);
      float32_t frequency = ((index * SAMPLE_FREQ) / smplRate);
      printf("Frequency = %.1f, bin energy[%i] = %.1f\n", frequency, index, maxValue );

      sampleCount = 0;
    }
  }

  /* Clear TIM6 update interrupt pending bit */
  TIM_ClearITPendingBit(TIM6, TIM_IT_Update);
}
/* end interrupt handler */
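Regarding the aside that the CMSIS max call "doesn't appear to work correctly": the library function is arm_max_f32(pSrc, blockSize, &maxValue, &index), and it searches the first blockSize entries, so passing MAX_ENERGY_INDEX as blockSize should reproduce the manual loop exactly. For illustration, a plain-C equivalent of what it computes (not the CMSIS code itself):

```c
#include <stdint.h>

/* Plain-C equivalent of what arm_max_f32() computes: the largest
 * value among the first blockSize entries of pSrc, and its index. */
static void max_f32(const float *pSrc, uint32_t blockSize,
                    float *pResult, uint32_t *pIndex)
{
    float maxVal = pSrc[0];
    uint32_t maxIdx = 0;
    for (uint32_t i = 1; i < blockSize; i++) {
        if (pSrc[i] > maxVal) {
            maxVal = pSrc[i];
            maxIdx = i;
        }
    }
    *pResult = maxVal;
    *pIndex = maxIdx;
}
```

If it behaves the same, the manual loop in the handler could collapse to a single arm_max_f32(cfftBuffer, MAX_ENERGY_INDEX, &maxValue, &index) call.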
(Just a minor comment before I go on vacation tomorrow; I hope you don't mind me chipping in.)
Though I do not know much about FFTs, I recommend clearing the pending bit at the beginning of your interrupt.
It's always good to clear it as early as possible, so you reduce the risk of missing an interrupt.
It could affect your readings, depending on how fast your MCU is running.
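A minimal sketch of the suggested ordering, using the SPL names from the posted code (handler name as in the standard STM32F4 startup file):

```c
/* Acknowledge the interrupt first, then do the work, so a new timer
 * update arriving mid-handler re-pends the interrupt instead of
 * being lost. */
void TIM6_DAC_IRQHandler(void)
{
    if (TIM_GetITStatus(TIM6, TIM_IT_Update) != RESET) {
        TIM_ClearITPendingBit(TIM6, TIM_IT_Update);  /* clear early */
        /* ... sampling / processing work goes here ... */
    }
}
```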
Also, there is an unwritten rule, a definite no-no: never call heavy library functions (printf, for instance) from an interrupt service routine.
Don't mind at all! I appreciate your comments. 100% agree on clearing the interrupt; I have already changed that. The printf is purely for debugging through SWO,
and yes, it does impact the performance of the interrupt handler to a large degree; it will be removed after debugging.
Happy USA day, jensbauer.