Hello,
The DSP concept guys say that it's time to use ARM Cortex-M microcontrollers for embedded DSP systems, so I looked at the CMSIS library of filtering functions and found that they are block-based.
As you know, the most painful feature of the ARM Cortex-M architecture is the lack of a circular buffer addressing mode.
I cannot find an example of these functions applied to a continuous, real-time signal. My guess is that the big problem is gathering blocks of input samples in a structure compatible with the CMSIS FIR function. This should be done by a DMA controller, as we don't want to waste core clock cycles, and this task is not easy. The CMSIS FIR functions have an internal state buffer whose length equals block_size+numOfTaps-1.
In multiple steps (= block_size/4), the function copies 4 samples at a time from the input buffer to the state buffer (using the core!). After that, before the next input block is filtered, the last numOfTaps-1 samples in the state buffer must be moved to the beginning of this buffer.
It looks bad.
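To make the buffer handling described above concrete, here is a minimal sketch in plain C. It is not the CMSIS source: `block_fir` is a hypothetical helper, it processes one sample at a time instead of four, and it assumes the coefficients are stored in natural order (`coeffs[k]` multiplies `x[n-k]`). It only illustrates the copy into the state buffer and the move of the last numOfTaps-1 samples back to the front.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of CMSIS-DSP.
 * state must hold (num_taps - 1 + block_size) floats and must be
 * zeroed before the first call. */
void block_fir(const float *coeffs, size_t num_taps,
               float *state,
               const float *in, float *out, size_t block_size)
{
    size_t hist = num_taps - 1;   /* samples kept from the previous block */

    /* 1. Append the new input block after the saved history. */
    memcpy(state + hist, in, block_size * sizeof(float));

    /* 2. Convolve: state[hist + n] is x[n] of the current block. */
    for (size_t n = 0; n < block_size; n++) {
        float acc = 0.0f;
        for (size_t k = 0; k < num_taps; k++) {
            acc += coeffs[k] * state[hist + n - k];
        }
        out[n] = acc;
    }

    /* 3. Move the last (num_taps - 1) samples to the start of the
     *    state buffer so the next block sees them as history. */
    memmove(state, state + block_size, hist * sizeof(float));
}
```

Steps 1 and 3 are the extra copies I am complaining about: they cost core cycles on every block, on top of the filtering itself.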
Maybe one of you has solved this problem and used this function in real time. If so, please write to me about it.
Kind regards
Roman Rumian
Hello, I managed to implement it in real time! I found the right way.
Kind Regards
Any insight you could provide to that end, instead of just stating it works?
Here is a simple ping-pong buffer example using a biquad. The filter was generated with ASN DSP Filter Designer and adapted to use a ping-pong buffer in real time:
// ** Primary Filter (H1)**
//
// Band#   Frequencies (kHz)   Att/Ripple (dB)
//   1       0.000,  2.400         0.001
//   2      12.000, 24.000        80.000
//
// Arithmetic = 'Floating Point (Single Precision)';
// Architecture = 'IIR';
// Structure = 'Direct Form II Transposed';
// Response = 'Lowpass';
// Method = 'Butterworth';
// Biquad = 'Yes';
// Stable = 'Yes';
// Fs = 48.0000; //kHz
// Filter Order = 8;

// Included Headers
#include "SMM_MPS2.h"   // MPS2 FPGA board
#define ARM_MATH_CM4    // Cortex-M4 (default)
#include "arm_math.h"   // CMSIS-DSP library header

extern uint8_t audio_init(void);   // Initialize audio hardware
extern void read_sample(int16_t *left, int16_t *right);
extern void play_sample(int16_t *left, int16_t *right);

#define BLOCKSIZE 32
#define NUM_SECTIONS_IIR 4

// ** IIR Direct Form II Transposed Biquad Implementation **
// y[n] = b0 * x[n] + w1
// w1 = b1 * x[n] + a1 * y[n] + w2
// w2 = b2 * x[n] + a2 * y[n]

// IIR Coefficients
float32_t iirStatesf32[NUM_SECTIONS_IIR*5];
float32_t iirCoeffsf32[NUM_SECTIONS_IIR*5] =
{   // b0, b1, b2, a1, a2
    // 48KHz sampling
    0.0651210, 0.1302419, 0.0651210, 1.0463270, -0.2788444,
    0.0649572, 0.1299143, 0.0649572, 1.1071010, -0.3531238,
    0.0660114, 0.1320228, 0.0660114, 1.2402050, -0.5158056,
    0.0721109, 0.1442218, 0.0721109, 1.4713260, -0.7982877
};

arm_biquad_cascade_df2T_instance_f32 S;

// Samples from both audio channels
int16_t left_channel_in;
int16_t right_channel_in;
int16_t left_channel_out;
int16_t right_channel_out;

// Ping-pong buffer
float32_t InputValues_A[BLOCKSIZE];
float32_t OutputValues_A[BLOCKSIZE];
float32_t InputValues_B[BLOCKSIZE];
float32_t OutputValues_B[BLOCKSIZE];

volatile int PingPongState=0;       // 0 = processing A, I/O = B
                                    // 1 = processing B, I/O = A
volatile int Block_Counter=0;       // Counts samples from 0 to BLOCKSIZE-1,
                                    // then triggers processing and returns to 0
volatile int processing_trigger=0;  // When set to 1, start processing
                                    // Set to 2 during processing, returns to 0 when processing is done

int main(void)
{
    int i;

    // Initialize data in ping-pong buffer
    for (i=0;i<BLOCKSIZE;i++) {
        InputValues_A[i] = 0.0f;
        OutputValues_A[i] = 0.0f;
        InputValues_B[i] = 0.0f;
        OutputValues_B[i] = 0.0f;
    }

    // Initialise Biquads
    arm_biquad_cascade_df2T_init_f32 (&S, NUM_SECTIONS_IIR, &(iirCoeffsf32[0]), &(iirStatesf32[0]));

    audio_init();   // Initialise the audio interface

    // Wait for I2S IRQs to occur and service them.
    // Sleeping in between saves power.
    while(1){
        if (processing_trigger>0) {
            processing_trigger=2;   // indicates biquad is running
            if (PingPongState==0) {
                arm_biquad_cascade_df2T_f32 (&S, &InputValues_A[0], &OutputValues_A[0], BLOCKSIZE);   // perform filtering
            } else {
                arm_biquad_cascade_df2T_f32 (&S, &InputValues_B[0], &OutputValues_B[0], BLOCKSIZE);   // perform filtering
            }   // end-if (PingPongState==0)
            processing_trigger=0;   // return to 0
        }   // end-if (processing_trigger>0)
        __WFE();   // sleep while there is nothing to be done
    }
}

/************************************************************************/
/* I2S audio IRQ handler. Triggers at 48KHz.                            */
/************************************************************************/
void I2S_Handler(void)
{
    int local_Block_Counter;   // Block_Counter counts from 0 to BLOCKSIZE-1

    local_Block_Counter = Block_Counter;

    // Read sample from ADC
    read_sample(&left_channel_in, &right_channel_in);

    if (PingPongState==0) {
        InputValues_B[local_Block_Counter] = (float) left_channel_in;
        left_channel_out=(int) OutputValues_B[local_Block_Counter];
    } else {
        InputValues_A[local_Block_Counter] = (float) left_channel_in;
        left_channel_out=(int) OutputValues_A[local_Block_Counter];
    }
    right_channel_out = right_channel_in;   // Only left channel is processed

    // Write sample to DAC
    play_sample(&left_channel_out, &right_channel_out);

    local_Block_Counter++;
    if (local_Block_Counter>=BLOCKSIZE) {
        // Wrap around and toggle ping-pong buffer
        local_Block_Counter = 0;
        PingPongState = (PingPongState+1) & 0x1;   // toggle state
        if (processing_trigger==2) {
            // Biquad is still running - overrun
            __BKPT(1);   // Error
        } else {
            processing_trigger = 1;   // Start new biquad
        }
    }
    Block_Counter = local_Block_Counter;   // save new Block_Counter
    return;
}
Hope this helps.
regards,
Joseph
Hello !
Thanks for sharing the above solution. My implementation follows the code pretty much exactly, except I have a much lower frequency with a slightly higher order filter.
I'm having trouble recovering the signal from the output arrays to present to the DAC. In the code above, the scaling of the ADC input and of the DAC output is presumably hidden inside the external functions "read_sample" and "play_sample". This is where I am having an issue.
Essentially - what conditioning is required with the ADC signal *before* placing the sample(s) into the InputValues_A and InputValues_B buffers?
Conversely - what re-scaling or converting is required *after* taking sample(s) from the OutputValues_A and OutputValues_B buffers and sending them to the DAC? I understand that the ADC is uint16_t, the DAC is uint32_t, and the arrays are float32_t. But other than the casting for that, what needs to be done?
In my case I present a fixed low frequency sine wave into the ADC at the center of my IIR bandpass filter. I am getting a signal out at the input frequency, but it is not scaled correctly and is shifted from the right DAC midpoint of 2048. No matter what I try to coerce or scale the output to, it results in the signal flattening or distorting.
Any feedback would be very appreciated, thank you!
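One common way to handle this kind of conversion, sketched below under assumptions that may not match the hardware in question: the ADC and DAC are both taken to be 12-bit unsigned converters biased at mid-scale (2048), and the helper names `adc_to_float` and `float_to_dac` are hypothetical. The idea is to remove the mid-scale offset and normalise to roughly ±1.0 before filtering, then restore the offset and clamp afterwards. A bandpass filter removes DC, so its output is centred on zero. Casting that output straight to an unsigned type loses the midpoint, and converting a negative float to an unsigned integer is not well defined in C.

```c
#include <stdint.h>

/* Assumed converter parameters: 12-bit unsigned codes, mid-scale bias. */
#define CODE_MIDPOINT   2048.0f
#define CODE_HALF_RANGE 2048.0f
#define DAC_MAX_CODE    4095

/* Hypothetical helper: unsigned ADC code -> signed float in about [-1, 1). */
static inline float adc_to_float(uint16_t raw)
{
    return ((float)raw - CODE_MIDPOINT) / CODE_HALF_RANGE;
}

/* Hypothetical helper: signed float -> unsigned DAC code.
 * Restores the mid-scale offset, clamps to the valid code range,
 * and rounds to the nearest code. */
static inline uint32_t float_to_dac(float y)
{
    float code = y * CODE_HALF_RANGE + CODE_MIDPOINT;

    if (code < 0.0f) {
        code = 0.0f;
    }
    if (code > (float)DAC_MAX_CODE) {
        code = (float)DAC_MAX_CODE;
    }
    return (uint32_t)(code + 0.5f);
}
```

In the ping-pong handler this would mean storing `adc_to_float(sample)` into InputValues_A/B and sending `float_to_dac(OutputValues_A/B[i])` to the DAC. If the output still clips, the filter's passband gain is probably above unity and the result needs to be scaled down before the conversion.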