Cortex M7 DSP moving average UMAAL

Hello,

I would like to implement a moving average function using the DSP instructions of the ARM Cortex-M7. Unfortunately I couldn't find a direct example. My goal is to have variables for

- the sum

- the new value

- the oldest value

Then the algorithm is sum = sum + new value - oldest value and average = sum / (number of values between oldest and new), i.e. two instructions.

I think the UMAAL instruction could be good for this. I found it in the instruction set summary, but it is not in the CMSIS library. Why is that? Where can I find details about it, and how can I write my own C wrapper that calls it via inline assembly?

Thanks for any help and hints or other ideas on this topic, I hope it also helps other developers.

Kind regards

Martin

(I posted this also in Processor discussions, because I missed the embedded forum. Please delete it there, thanks)

  • Hi Martin,

    It's not clear why you believe UMAAL would be of use in this scenario; full details of the instruction can be found in the Armv7-M Architecture Manual here: https://developer.arm.com/products/architecture/cpu-architecture/m-profile/docs/ddi0403/latest
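
    For reference, UMAAL multiplies two unsigned 32-bit values and adds two further unsigned 32-bit values to the 64-bit product. The sketch below models that in portable C and, as an assumption, wraps the instruction with GCC/Clang-style inline assembly when the compiler defines __ARM_ARCH_7EM__ (as is typically the case for Cortex-M4/M7 targets). The names umaal_model and umaal are hypothetical and not part of CMSIS.

```c
#include <stdint.h>

/* Portable model of UMAAL: RdHi:RdLo = Rn * Rm + RdLo + RdHi.
 * The largest possible result is exactly 2^64 - 1, so it never overflows. */
static inline uint64_t umaal_model(uint32_t lo, uint32_t hi,
                                   uint32_t rn, uint32_t rm)
{
    return (uint64_t)rn * rm + lo + hi;
}

/* Hypothetical wrapper: uses the real instruction where available
 * (assumption: GCC/Clang extended asm, Armv7E-M target), otherwise
 * falls back to the portable model above. */
static inline uint64_t umaal(uint32_t lo, uint32_t hi,
                             uint32_t rn, uint32_t rm)
{
#if defined(__GNUC__) && defined(__ARM_ARCH_7EM__)
    /* lo and hi are both read and written by the instruction. */
    __asm__ ("umaal %0, %1, %2, %3"
             : "+r" (lo), "+r" (hi)
             : "r" (rn), "r" (rm));
    return ((uint64_t)hi << 32) | lo;
#else
    return umaal_model(lo, hi, rn, rm);
#endif
}
```

    As the model shows, the instruction is built around a multiply, which a plain running sum does not need.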

    I would anticipate moving average code to be something like:


    #include <stdint.h>
    
    #define WINDOW 16
    
    typedef uint16_t sample_t;
    typedef uint32_t accum_t;
    
    sample_t samples[WINDOW];
    unsigned int position;
    
    void add_sample(sample_t value)
    {
      /* Wrap to the start of the circular buffer once the end is reached. */
      if(position >= WINDOW)
        position = 0;
    
      samples[position++] = value;
    }
    
    sample_t get_average(void)
    {
      accum_t accumulator = 0;
      unsigned int i;
    
      for(i=0; i<WINDOW; i++)
        accumulator += samples[i];
    
      return accumulator / WINDOW;
    }

    The only place I might anticipate a long multiply being used is where accum_t is 32 bits in size and WINDOW isn't a power of two, at which point an optimising compiler may choose to perform the division via long multiplication by the divisor's reciprocal.
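
    A minimal sketch of division by reciprocal multiplication, written out by hand for illustration. It assumes a window of 73 (an arbitrary non-power-of-two example) and 16-bit samples, so the sum never exceeds 73 * 65535; within that range, multiplying by ceil(2^32 / 73) and keeping the upper 32 bits gives the same result as the division. The names MAX_SUM, RECIPROCAL and div_by_window are hypothetical, and a compiler may well choose a different constant and shift, or simply use the hardware divide.

```c
#include <stdint.h>

#define WINDOW  73u                  /* example non-power-of-two window */
#define MAX_SUM (WINDOW * 65535u)    /* largest sum of WINDOW 16-bit samples */

/* ceil(2^32 / WINDOW), computed in 64-bit arithmetic at compile time. */
#define RECIPROCAL \
    ((uint32_t)(((UINT64_C(1) << 32) + WINDOW - 1u) / WINDOW))

/* Divide by WINDOW using one 32x32->64 multiply and a shift.
 * Only claimed to be exact for sum <= MAX_SUM (assumption stated above). */
static inline uint32_t div_by_window(uint32_t sum)
{
    return (uint32_t)(((uint64_t)sum * RECIPROCAL) >> 32);
}
```

    The 32x32 to 64-bit product here is the kind of long multiply meant above; on Armv7-M it would normally map to UMULL rather than UMAAL, since nothing is accumulated.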

    Best regards

    Simon.

  • Optimised as you describe, this yields the same division. An optimising compiler should choose the best implementation, since it can deduce that WINDOW is a constant and knows the particular characteristics of the operand types:

    #include <stdint.h>
    
    #define WINDOW 73
    
    typedef uint16_t sample_t;
    typedef uint32_t accum_t;
    
    sample_t samples[WINDOW];
    accum_t sum;
    unsigned int position;
    
    sample_t add_and_get_average(sample_t sample)
    {
      if(position >= WINDOW)
        position = 0;
    
      /* Swap the oldest sample for the new one, keeping the running sum in step. */
      sum -= samples[position];
      sum += sample;
      samples[position] = sample;
      position++;
    
      return sum / WINDOW;
    }
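
    One point to be aware of with a zero-initialised buffer is that the average reads low until WINDOW samples have arrived. The variant below is only a sketch of one way to handle that, dividing by the number of samples received so far, which matches the "number of values between oldest and new" in the question. The count variable and the function name are hypothetical additions, and because the divisor is no longer a compile-time constant the division would typically become a hardware divide.

```c
#include <stdint.h>

#define WINDOW 73u

typedef uint16_t sample_t;
typedef uint32_t accum_t;

static sample_t samples[WINDOW];   /* circular buffer, starts zeroed */
static accum_t sum;                /* running sum of the buffer contents */
static unsigned int position;      /* next slot to overwrite */
static unsigned int count;         /* hypothetical: valid samples, capped at WINDOW */

sample_t add_and_get_average_warmup(sample_t sample)
{
    if (position >= WINDOW)
        position = 0;

    /* Swap the oldest sample for the new one in the running sum. */
    sum -= samples[position];
    sum += sample;
    samples[position] = sample;
    position++;

    /* Until the buffer is full, divide by the samples seen so far. */
    if (count < WINDOW)
        count++;

    return (sample_t)(sum / count);
}
```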

    Simon.