The DSP concept guys say that it's time to use ARM Cortex-M microcontrollers for embedded DSP systems, so I looked at the CMSIS library of filtering functions and found that they are block-based.
As you know, the most painful feature of the ARM Cortex-M architecture is the lack of a circular buffer addressing mode.
I cannot find an example of applying these functions to a continuous, real-time signal because, I guess, there is a big problem with gathering blocks of input samples in a structure compatible with the CMSIS FIR function. This should be done by a DMA controller, as we don't want to lose core clock cycles, and this task is not easy. The CMSIS FIR functions have an internal state buffer whose length equals block_size+numOfTaps-1.
In multiple steps (=block_size/4), the function copies 4 samples at a time from the input buffer to the state buffer (using the core !!!), and after that, before the next input block can be filtered, the last numOfTaps-1 samples in the state buffer must be moved to the beginning of this buffer.
It looks bad.
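To make the data movement described above concrete, here is a rough sketch in plain C. It is not the CMSIS source: the names fir_sketch_t and fir_block_sketch are hypothetical, the loop is not unrolled by 4, and the coefficient ordering is my own choice and may differ from the library. It only shows the three steps per block: copy the input into the state buffer, convolve, then move the last numTaps-1 samples to the front.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch (not the CMSIS implementation): block FIR with a
 * linear state buffer of numTaps + blockSize - 1 samples. */
typedef struct {
    size_t       numTaps;
    const float *coeffs;  /* h[0] .. h[numTaps-1], natural order (assumption) */
    float       *state;   /* numTaps + blockSize - 1 samples, zeroed at start */
} fir_sketch_t;

static void fir_block_sketch(fir_sketch_t *f, const float *in,
                             float *out, size_t blockSize)
{
    assert(f->numTaps >= 1);
    size_t hist = f->numTaps - 1;  /* samples kept from the previous block */

    /* Step 1: the core copies the new block behind the history. */
    memcpy(f->state + hist, in, blockSize * sizeof(float));

    /* Step 2: convolution; x[n] sits at state[hist + n]. */
    for (size_t n = 0; n < blockSize; n++) {
        float acc = 0.0f;
        for (size_t k = 0; k < f->numTaps; k++) {
            acc += f->coeffs[k] * f->state[hist + n - k];
        }
        out[n] = acc;
    }

    /* Step 3: the core moves the last numTaps-1 samples to the beginning,
     * so they become the history for the next block. */
    memmove(f->state, f->state + blockSize, hist * sizeof(float));
}
```

Steps 1 and 3 are pure data movement done by the core on every block, which is the overhead in question.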
Maybe one of you has solved this problem and used this function in real time; if so, please write to me about it.
Many thanks for the kind answer.
Anyway, this function wastes time on unnecessary transfers.
The first one is the copy of samples (four samples at a time) from the input buffer to the FIR state buffer. Why not use the last blockSize cells of the state buffer for this purpose (as an input buffer)?
A possible conflict between the CPU and the DMA may be eliminated by using two ping-pong state buffers. If another DMA channel is used to move the last N samples (N = numOfTaps-1) from the stateA buffer to the beginning of the stateB one, many core clock cycles may be saved for the FIR convolutions.
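A minimal sketch of this ping-pong idea, in plain C with no real DMA involved. Everything here is an assumption for illustration: the names pingpong_fir_t, pp_dma_target, pp_block_ready, pp_carry_history and pp_filter are hypothetical, the sizes are fixed small constants, and the memcpy in pp_carry_history only stands in for the second DMA channel. The sample DMA would write straight into the tail of the inactive state buffer, so the core never copies input samples.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_TAPS   3
#define BLOCK_SIZE 4
#define HIST       (NUM_TAPS - 1)        /* samples carried between blocks */
#define STATE_LEN  (HIST + BLOCK_SIZE)

/* Hypothetical ping-pong state: buffer 0 is "stateA", buffer 1 is "stateB". */
typedef struct {
    float state[2][STATE_LEN];
    int   active;  /* buffer whose tail holds the newest complete block */
} pingpong_fir_t;

/* Address the sample DMA should fill next: the tail of the inactive buffer. */
static float *pp_dma_target(pingpong_fir_t *p)
{
    return &p->state[p->active ^ 1][HIST];
}

/* To be called from the (assumed) DMA transfer-complete interrupt. */
static void pp_block_ready(pingpong_fir_t *p)
{
    p->active ^= 1;
}

/* Stand-in for the second DMA channel: copy the last HIST samples of the
 * active buffer to the start of the other one. It touches only the head of
 * the inactive buffer, while the sample DMA writes only its tail. */
static void pp_carry_history(pingpong_fir_t *p)
{
    memcpy(p->state[p->active ^ 1],
           &p->state[p->active][BLOCK_SIZE],
           HIST * sizeof(float));
}

/* Convolve directly on the active state buffer: no input copy by the core. */
static void pp_filter(const pingpong_fir_t *p, const float *h, float *out)
{
    assert(h != NULL && out != NULL);
    const float *s = p->state[p->active];
    for (size_t n = 0; n < BLOCK_SIZE; n++) {
        float acc = 0.0f;
        for (size_t k = 0; k < NUM_TAPS; k++) {
            acc += h[k] * s[HIST + n - k];
        }
        out[n] = acc;
    }
}
```

The intended order per block would be: the sample DMA fills pp_dma_target(), its completion calls pp_block_ready(), then the history copy is started and the core runs pp_filter() while the next block arrives in the other buffer. Whether the DMA can keep up with this on a particular part is something I have not verified.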
I plan to modify the CMSIS functions this way and will let you know the results.