This discussion has been locked.

You can no longer post new replies to this discussion. If you have a question you can start a new discussion

Using CMSIS DSP Function: arm_fir_f32 - Correct Use?

Hi All;

I have some questions about correct use of the CMSIS DSP library call arm_fir_32. First, I'll provide some background about what I am doing and what the setup is.

I have a STM32F4 Discovery board, using IAR EWARM for programming. Just for testing purposes, I'm generating a low frequency test signal at 40Hz and feeding it into one of the ADC inputs. The signal is biased to swing from 0 to about 2.5Vpp. The signal has a low to moderate amount of broadband noise - but at this point I am not purposely mixing or introducing any other signals with it. There is a timer interrupt set to sample frequency of 2KHz, with a sampling buffer of 2048 samples.

I have already tested and am using the FFT function arm_cfft_f32, and can accurately determine (track) the frequency of the incoming signal when I change it at the source. This seems to be working well.

Now, I would like to use the arm_fir_32 filter. To do this, I started out reading the documentation from CMSIS on the function. To implement a FIR low pass, to generate the tap coefficients, I am using this website's only tool to do so.

I generated a 4th order filter, and set the sampling rate the same as my software, with a cutoff of 60Hz. I forced generation to 24 taps to be even. So the BLOCK_SIZE is 32, and the number of blocks is 1024/32 = 32.

Following the example from CMSIS on this function, I believe I've set up correctly. So the chain looks like this:

ADC --> FIR --> FFT

However, I'm not getting the result I would expect. The values returned from the FFT's output buffer are exponentially large (not this way if I comment out /circumvent the FIR calls). This leads me to believe I am missing a step. Do I need to normalize the values? I thought that because I input the rate into the FIR function setup, this wouldn't be required - but maybe this is incorrect.

Can someone please provide some insight or assistance as to what I am missing or doing incorrectly to apply the FIR processing?

Thank you,

Gary

Top replies

Gary over 9 years ago +1 verified

No response = close thread. Thanks for those who were polite enough to help and respond.

0 Dr. Paul Beckmann over 9 years ago

Hi Gary,
The CMSIS FIR function does not have any unusual scaling that you need to follow. Since you are generating a lowpass filter you can easily determine if the filter is scaled properly by determining its DC response. Simply sum up the coefficients in the filter and the result should be close to 1.0. One thing to keep in mind is that you have a relatively short FIR filter. At 24 points with a sample rate of 2 kHz, your frequency resolution is 83 Hz (2000/24) and you thus won't be able to make changes in the frequency domain at a finer resolution than this.
Could you share your code and we'll take a quick look?
-Paul
Cancel
Up 0 Down

Cancel

0 Gary over 9 years ago in reply to Dr. Paul Beckmann

Please, accept my apology for my previous post, it was uncalled for. Just a bit impatient, I am.

Paul, I determined the error and your response helped me to do that. Somehow the FIR coefficients were getting trashed the way I was declaring them and thus resulting in all the trash I was seeing. Now, getting

values that I would have expected. I've created another set of filter coefficients. o to 45Hz, G = 1, 50Hz to 1KHz, G = 0. 1.5dB ripple, -80dB attenuation, 978 taps.

Could you please elaborate a bit further on your comment about the resolution - if I were to use that calculation, it would now be 2000/978 = 2.05. Are you saying this is the lowest resolution of frequency I can accurately ascertain after processing it through the FFT?

Why does the number of taps affect the resolution of the FFT which is another function? If so, how can I attain higher frequency resolution without changing sampling frequency, which is at least 40x higher that the highest frequency I am processing (50Hz) ?

NB - I should have added that I am trying to obtain very high frequency resolution, sub-hertz if possible.

Again, my apologies and I thank you for taking the time to answer my questions.

As requested here is sample code that I am using, in the interrupt handler, which is set to run at SAMPLE_FREQ:

#define SAMPLE_FREQ	(uint32_t)2000
#define INPUT_SAMPLES	(uint32_t)1024
#define ADC_COUNTS	(uint32_t)4096
#define VREF	(float32_t)3.2

void TIM6_DAC_IRQHandler(void)

{

if( TIM_GetITStatus(TIM6, TIM_IT_Update) )

{

volatile uint16_t uhADCxConvertedValue;

uint32_t index;

float32_t *inputBufPtr;

float32_t *outputBufPtr;

float32_t smplRate = 0;

if( sampleCount == 0 )

{

arm_fill_f32(0.0, ADC_Values, INPUT_SAMPLES);

}

uhADCxConvertedValue = ADC_get();

/* subtract the DC offset that was added as this is not a bipolar ADC */

ADC_Values[sampleCount] = ( ((uhADCxConvertedValue * VREF) / ADC_COUNTS) - 1.6);

if(sampleCount++ == INPUT_SAMPLES)

{

inputBufPtr = &ADC_Values[0];

outputBufPtr = &outputBuffer[0];

arm_fir_instance_f32 S;

arm_fir_init_f32(&S, FILTER_TAP_NUM, (float32_t *)&filter_taps[0], &firStateF32[0], BLOCK_SIZE);

for(uint16_t i = 0; i < NUM_BLOCKS; i++)

{

arm_fir_f32(&S, inputBufPtr + (i * BLOCK_SIZE), outputBufPtr + (i * BLOCK_SIZE), BLOCK_SIZE);

}

/* Process the data through the CFFT/CIFFT module */

arm_cfft_f32(&arm_cfft_sR_f32_len512, outputBuffer, ifftFlag, doBitReverse);

/* Calculate the magnitude at each bin */

arm_cmplx_mag_f32(outputBuffer, cfftBuffer, fftSize);

/* get the maximum energy at index */

arm_max_f32( cfftBuffer, BLOCK_SIZE, &maxValue, &index );

printf("Max Value = %.1f, at index = %i\n", maxValue, index);

//compute frequency

smplRate = (INPUT_SAMPLES / 1.0);

printf("frequency = %.1f\n", (float32_t)((index * SAMPLE_FREQ) / smplRate) );

sampleCount = 0;

}

/* Clear TIM6 Capture compare interrupt pending bit */

TIM_ClearITPendingBit(TIM6, TIM_IT_Update);

}

0 Dr. Paul Beckmann over 9 years ago in reply to Gary

Hi Gary,
This is a rough rule of thumb when it comes to FIR filters. When you compute sample rate divided by FIR length ( 2000/978 = 2.05) you get a rough metric for how accurately you can adjust the frequency response of the FIR filter. You can make changes in the frequency response at roughly 2 Hz spacing. This is broad rule of thumb I use to determine if the filter is long enough in length.
The FIR filter and FFT have separate lengths, but a similar rule of thumb. For the FFT, the frequency resolution is sample rate / FFT size.
Do you need high resolution across the entire frequency band or just at low frequencies?
-Paul
Cancel
Up 0 Down

Cancel
0 Jens Bauer over 9 years ago

It's good to see another Cortex-M developer, welcome to the community. I believe you've come to the right place.
I would like to answer this question, if I could. Unfortunately I'm absolutely no expert in the DSP-area.
Many of us are not online frequently, and might miss out the question, plus I believe that very few people are experienced with the use of the DSP library's FFT functions.
That sums up to that it'll sometimes take a while for deep-technical questions to be answered.
As you've already seen by now, pbeckmann is an expert in this area.
-If you know any ARM developers, please let them know about this community; the more developers, the quicker we'll get replies to advanced questions.
Cancel
Up 0 Down

Cancel
0 Gary over 9 years ago in reply to Dr. Paul Beckmann

Paul (Dr.Beckmann)
I only require resolution at low frequencies: 10Hz to 50Hz
I'm confident the FIR has adequate taps for the filtering required. I'm still trying to understand what the impact of putting in a 2 Hz resolution filter ahead of the FFT, which has a (2000/2048) resolution, does to the samples. Does it mean that the distribution of error (of 2Hz) should fall around some Gaussian mean? i.e., 30 Hz could be 31, or 29, but likely 30.4 or 29.6 for approximately 70% of the samples?
After doing a bit more research, I've found that it might be an improvement to use a Windowing function. Do you agree? It appears that CMSIS-DSP has no Window functions that I can see. From what I understand, the Hann would be better to eliminate or reduce the buffer endpoints as they may cause noise that would create the jitter. Can I implement the Window by processing each sample as: y[x]=0.5-0.5cos(2pi n / N-1) ? If so, I've tried this, but it is making the results bad. Should the Window be after the cfft?
One further question: the CMSIS function arm_cfft_f32 returns a buffer which you can then search for a peak sample n (n > 0) and then calculate the frequency. What function is available that will give me the indices of the first N peaks? Or is this a function left for the user to write?
Again, I appreciate your generous time and assistance.
Gary
Cancel
Up 0 Down

Cancel
0 Gary over 9 years ago in reply to Jens Bauer

Hi jensbauer,
Thank you.
I've been on the ARM Cortex platform for just over 2 years now. My first was the M3 (LPC1768) which is a nice little processor. I say "little" carefully as it is down a few notches from the STM32F4 that I'm now using. There's been a few hiccups with some of the software under CMSIS, but I couldn't be happier with the ARM Cortex performance. I run the processors hard to get all I can, using FreeRTOS, Visual State and lots of HW interrupts for my projects. The NVIC feature in ARM Cortex has made a world of difference in deterministic operation with respect to timing.
To improve what I'm now working on requires the evolution to DSP processing. So far, so good. However, iIt is difficult to find someone who can provide some insight on the CMSIS DSP library, and I am very grateful that Dr.Beckmann has been kind enough to answer my questions. Hopefully others that will read these posts can benefit from his answers.
Regards,
Gary
Cancel
Up 0 Down

Cancel
0 Jens Bauer over 9 years ago in reply to Gary

I'm working a lot with the LPC series Cortex-M processors myself.
Though I have the LPC43xx, my favorite is still the LPC1768.
This is because it's packed with so much goodies that it can handle almost any kind of job.
But most of the time, I select the smallest possible MCU that will fit my needs.
I tend to lean towards writing assembly code, if I can get away with it.
Though it may be in different ways, we share the desire to push a device to its limits - that's the fun part anyway.
(I too have a STM32F4; the Discovery Board, but I'm still working on getting my IDE compatible).
Cancel
Up 0 Down

Cancel
0 Dr. Paul Beckmann over 9 years ago in reply to Gary

Hi Gary,
My comment regarding the "2 Hz resolution of the FIR filter" just refers to how accurately you can specify its frequency response. It can be just about any shape (e.g., low pass at 40 Hz, low pass at 80 Hz). However you can't have the filter have gain g0 at 10 Hz and then a widely different gain at 11 Hz. I think its safe for what you are doing.
Its a good idea to use a window (Hanning is a good choice) when computing FFTs. If you don't do this you'll find that the result depends heavily on what is happening at the edges of the signal. If you only care about low frequencies and you want very good resolution, I would recommend taking a long data set at the 2 kHz sample rate, filter by a low pass filter with cutoff frequency 100 Hz and then decimating by a factor of 10. This would reduce your signal length by a factor of 10 and reduce the sample rate to 200 Hz. If you computed an FFT then, say with 1024 points, then the frequency resolution would now be 200/1024 = 0.2 which is much better than what you had before.
Sorry, no function in CMSIS which finds the N largest peaks. But that shouldn't be too hard to write.
Cancel
Up 0 Down

Cancel
0 Gary over 9 years ago in reply to Dr. Paul Beckmann

Paul,
OK, thank you for that, no more questions. 8--)
I think it's working well enough now, where I can be happy with consistent 1Hz resolution.
Cancel
Up 0 Down

Cancel
0 Gary over 9 years ago in reply to Dr. Paul Beckmann

Hi Dr.Beckmann,
I am processing low frequency signals (for now, just a sine wave) under 50 Hz and using a low pass filter with Fc at 100Hz. Running samples through a simple Parks-McClellan FIR with 351 taps.
the filter was created for Fs = 1000Hz.
Input signal = 27Hz.
Fs = 1000Hz
Buffer size = 4096
N (fft length) = 2048
According to your post, the resolution should be: 1000/2048 = 0.49Hz
The largest bin should be the fundamental, which is at 24 in this case. After returning from the CMSIS call to arm_cmplx_mag_f32, here are the bins around it:
22: cfftBuffer = 42.4
23: cfftBuffer = 96.9
24: cfftBuffer = 470.8
25: cfftBuffer = 218.8
26: cfftBuffer = 88.3
If: f = bin# x (Fs / N) , then: f = 24 * (1000/2048) = 11.71Hz (should be close to 27Hz)

That doesn't work out anywhere near what I am sending into the ADC, or the calculation. Where am I going astray?
Thanks again for looking at my questions...
Cancel
Up 0 Down

Cancel
0 Dr. Paul Beckmann over 9 years ago in reply to Gary

Hi Gary,
Your analysis looks correct. The frequency resolution is 0.49 Hz per FFT bin. The input sine wave at 27 Hz should appear at 27/0.49 = about bin 55. Something else must be going on. Could you post your code so that I can take a quick look at it?
-Paul
Cancel
Up 0 Down

Cancel
0 Gary over 9 years ago in reply to Dr. Paul Beckmann

Hi Dr. Beckmann,
The code is posted below. It runs in the Timer interrupt, defined by:
TIM_TimeBaseStructure.TIM_Period = ( (RCC_Clocks.PCLK2_Frequency / SAMPLE_FREQ) - 1);
Here's where the issue seems to be. It works perfectly with frequency precision to 0.5Hz, or better, if I use these defines:
#define SAMPLE_FREQ (2000)
#define INPUT_SAMPLES (4096)
Here are the bins at/around maximum when running with the above defines:
cfftBuffer[53] = 49.8
cfftBuffer[54] = 167.3
cfftBuffer[55] = 1024.9
cfftBuffer[56] = 503.9
cfftBuffer[57] = 226.6
cfftBuffer[58] = 142.8
cfftBuffer[59] = 96.9
Frequency calc: 55 * (2000/4096) = 26.9 (input generator at 27Hz)
If I use the defines as below in the sample code, it does not compute the correct frequency; the maximum bins do not seem to be in the correct place. Here are the bins around the maximum:
(sorry about the previous post, I pasted the incorrect bins from another run)
cfftBuffer[9] = 27.7
cfftBuffer[10] = 13.5
cfftBuffer[11] = 28.4
cfftBuffer[12] = 563.1
cfftBuffer[13] = 156.8
cfftBuffer[14] = 91.2
cfftBuffer[15] = 67.1
12 * (1000/2048) = 5.9 (input generator at 27Hz)
BTW - I did try the decimation method as you suggested in your post (just used a 4 point moving average filter) but got similar bad results.
Thank you again for assisting me with this, very appreciated. (just hope I'm not missing something obvious)
Gary
/***** CODE BELOW ***/
#define SAMPLE_FREQ                     (uint32_t)1000
#define INPUT_SAMPLES                   (uint32_t)2048
#define FFT_SIZE                            (uint32_t)(INPUT_SAMPLES/2)
#define MAX_FREQ                        (float32_t)50.0
/* max energy bin index. Divide sample rate by sample size,
then multiply by max frequency and cast to int */
#define MAX_ENERGY_INDEX                (uint32_t)((INPUT_SAMPLES/SAMPLE_FREQ)*MAX_FREQ)
float32_t ADC_Values[INPUT_SAMPLES];
float32_t outputBuffer[INPUT_SAMPLES];
float32_t cfftBuffer[INPUT_SAMPLES/2];
/* Timer 6 Interrupt Handler */
void TIM6_DAC_IRQHandler(void)
{
if( TIM_GetITStatus(TIM6, TIM_IT_Update) )
{
    volatile uint16_t uhADCxConvertedValue;
    float32_t *inputBufPtr;
    float32_t *outputBufPtr;
    float32_t smplRate = 0;
    uint32_t index;

    uhADCxConvertedValue = ADC_get();
    //subtract small offset from sigGen
    ADC_Values[sampleCount] = ( ((uhADCxConvertedValue * VREF) / ADC_COUNTS) - 0.58);

    if(sampleCount++ == INPUT_SAMPLES)
    {
        inputBufPtr = &ADC_Values[0];
        outputBufPtr = &outputBuffer[0];
        arm_fir_instance_f32 S;
        arm_fir_init_f32(&S, FILTER_TAP_NUM, (float32_t *)&filter_taps[0], &firStateF32[0], BLOCK_SIZE);

        for(uint16_t i = 0; i < NUM_BLOCKS; i++)
        {
            arm_fir_f32(&S, inputBufPtr + (i * BLOCK_SIZE), outputBufPtr + (i * BLOCK_SIZE), BLOCK_SIZE);
        }

        if( FFT_SIZE == 1024 )
            arm_cfft_f32(&arm_cfft_sR_f32_len1024, outputBuffer, ifftFlag, doBitReverse);
        else
            arm_cfft_f32(&arm_cfft_sR_f32_len2048, outputBuffer, ifftFlag, doBitReverse);

        /* Calculate the magnitude at each bin */
        arm_cmplx_mag_f32(outputBuffer, cfftBuffer, FFT_SIZE);

        cfftBuffer[0] = 0;
        maxValue = 0;
       //find maximum bin (CMSIS call doesn't appear to work correctly!)
        for(uint16_t i = 0; i < MAX_ENERGY_INDEX; i++)
        {
          if( cfftBuffer[i] > maxValue )
          {
            maxValue = cfftBuffer[i];
            index = i;
          }
        }

         //compute frequency at max bin
         smplRate = (INPUT_SAMPLES / 1.0);
         float32_t frequency = ((index * SAMPLE_FREQ) / smplRate);
         printf("Frequency = %.1f, bin energy[%i] = %.1f\n", frequency, index, maxValue );
        sampleCount = 0;
    }

    /* Clear TIM6 Capture compare interrupt pending bit */
    TIM_ClearITPendingBit(TIM6, TIM_IT_Update);
}
}
/* end interrupt handler */
Cancel
Up 0 Down

Cancel
0 Jens Bauer over 9 years ago in reply to Gary

(Just a minor comment, before I go on vacation tomorrow, I hope you don't mind me chipping in).
Though I do not know much about FFTs, I will recommend clearning the pending bit in the beginning of your interrupt.
It's always good to clear it as early as possible, so you reduce the risk of missing out an interrupt.
It could affect your readings, depending on how fast your MCU is running.
Also, there is an unwritten rule, a no-no; do-not-do... Never call library functions (such as printf for instance) from an interrupt service routine.
Cancel
Up 0 Down

Cancel
0 Gary over 9 years ago in reply to Jens Bauer

Do not mind a all! I appreciate your comments. 100% agree on clearing the interrupt, have already changed that. The printf is totally for debug through SWO,
and yes it does impact the performance of the IH to a large degree, it will be removed after debugging.
Happy USA day, jensbauer.
gary
Cancel
Up 0 Down

Cancel
0 Gary over 9 years ago in reply to Dr. Paul Beckmann

Hi Dr. Beckmann,
Have you had the chance to look at the code I posted to see where things are going astray?
Right now I'm convinced that something weird is going on when I use a 1KHz sampling rate and an FFT of length 1024 (2048 samples). The bin values are not correct, and it does not appear to be a scaling issue. There is no uniformly divisible number that works out in the bins of interest.
Thanks...
Cancel
Up 0 Down

Cancel