Picture your smart assistant at home: you say a command, and it recognizes your voice, processes what you’re saying and responds. This is an example of a multi-sensor device that requires signal processing. Signal processing technology is critical in all sorts of devices around us today: wearables, audio headsets, smart speakers and cameras. We see spectacular growth in autonomous, intelligent, and connected devices like these, and the challenge is that they must operate in a low-power environment.
Signal processing technology is critical in all these devices around the home
To achieve signal processing functionality, these applications previously used a simple microcontroller (MCU) based on an Arm Cortex-M0 or Cortex-M3 processor together with a separate proprietary, dedicated Digital Signal Processor (DSP). Now, however, we are seeing an increasing number of product manufacturers (or Original Equipment Manufacturers - OEMs) switching to a single, high-performance, low-power MCU with DSP extensions, such as the Cortex-M4, Cortex-M7, Cortex-M33 or Cortex-M35P processor, to replace the two-processor design.
Using an Arm-based combination of MCU and DSP functionality in one processor has some advantages for OEMs, enabling them to:
This blog will cover the signal processing capabilities of the easy-to-use Arm Cortex-M processors and how to take advantage of software support from Arm's ecosystem partners. I will also cover how the architecture of our processors allows efficient implementation of the algorithms, and give details of the free DSP library from Arm, which includes an example that cleans noise from an electrocardiography (ECG) signal recording.
Signal processing algorithms are applied to raw data from the analog-to-digital converters to shape the data and improve the decisions made by the application software. Typical algorithms control the amplitude of the signal, remove noise or estimate the frequency of oscillation.
The key operations used for signal processing are based on a mathematical operation called discrete convolution. A convolution is computed as a sum of products, so any processor that can compute a multiply-accumulate efficiently, ideally in one cycle, can be used for signal processing.
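To make the sum of products concrete, here is a minimal sketch in portable C of one output sample of a discrete convolution, y[n] = sum over k of h[k] * x[n - k]. The function name `convolve_at`, the 16-bit operands and the 64-bit accumulator are illustrative choices, not part of any Arm library.

```c
#include <stddef.h>
#include <stdint.h>

/* One output sample of a discrete convolution:
 *   y[n] = sum over k of h[k] * x[n - k]
 * Samples outside the bounds of x are treated as zero.
 * (Hypothetical helper, written for illustration only.) */
static int64_t convolve_at(const int16_t *x, size_t x_len,
                           const int16_t *h, size_t h_len, size_t n)
{
    int64_t acc = 0;
    for (size_t k = 0; k < h_len; ++k) {
        if (k <= n && n - k < x_len) {
            /* One multiply-accumulate: the operation a DSP-capable
             * core aims to complete in a single cycle. */
            acc += (int32_t)h[k] * (int32_t)x[n - k];
        }
    }
    return acc;
}
```

For example, with x = {1, 2, 3} and h = {1, 1}, calling `convolve_at` for n = 0 to 3 yields 1, 3, 5 and 3. The inner loop body is a single multiply-accumulate, which is why the cost of that one operation dominates signal processing performance.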
Thirty years ago, data processing was limited to 10 million multiplies per second with 16-bit operands, and the address space was limited to a few tens of kBytes. Today, a small Cortex-M3 can be synthesized to run at well over 500MHz; it computes 32-bit multiplications, accumulates into 64 bits, and has several gigabytes of address space. While the Cortex-M3 doesn’t have DSP extensions, it can still do signal processing. There is no practical limitation to using Cortex-M devices for complex signal processing computation, and this blog will share some practical examples.
First, let’s take a step back to look at the technology Arm offers and help you understand the best fit for your application. The Arm Cortex family of processors provides a standard architecture to address the broad performance spectrum and cost range required by these diverse product markets. The Arm Cortex family includes processors based on three distinct profiles:
Cortex-A processors, and some Cortex-R processors, can include the NEON SIMD (single instruction, multiple data) extensions that provide high-performance mathematical instructions for signal and data processing.
Cortex-A and Cortex-R processors are used extensively for signal processing applications. This blog focuses on the Cortex-M processor family, so let’s take a look at the range of benefits and performance points offered by Cortex-M processors. Here’s a quick guide to the highlights:
Arm Cortex-M processor portfolio, including those with DSP extensions
The Cortex-M4, Cortex-M7, Cortex-M33 and Cortex-M35P are digital signal controllers that address the need for high-performance general-purpose code processing as well as digital signal processing applications. These processors include DSP extensions to the Thumb instruction set and can include an optional floating-point unit (FPU). The DSP instructions are designed to improve the performance of numerical algorithms and make it possible to perform signal processing operations directly on the CPU. As mentioned above, a Cortex-M3 can be used for signal processing, but the Cortex-M processors with DSP extensions provide far better performance.
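As an illustration of what the DSP extensions add, one of the instructions is SMLAD, a dual 16-bit multiply-accumulate that works on two samples packed into each 32-bit register. The sketch below is a portable C model of the arithmetic it performs, so it can be read and run without the hardware. The names `pack_q15x2` and `smlad_model` are hypothetical; on a real target, CMSIS exposes the instruction as the `__SMLAD` intrinsic.

```c
#include <stdint.h>

/* Pack two signed 16-bit samples into one 32-bit word
 * (first argument in the low half). Hypothetical helper. */
static uint32_t pack_q15x2(int16_t lo, int16_t hi)
{
    return (uint32_t)(uint16_t)lo | ((uint32_t)(uint16_t)hi << 16);
}

/* Portable model of the SMLAD arithmetic:
 *   acc + x.lo * y.lo + x.hi * y.hi
 * with both halves treated as signed 16-bit values.
 * The real instruction does this in one operation and flags overflow;
 * this model assumes the result fits in 32 bits and that the compiler
 * uses two's complement conversions (true for mainstream compilers). */
static int32_t smlad_model(uint32_t x, uint32_t y, int32_t acc)
{
    int16_t x_lo = (int16_t)(x & 0xFFFFu);
    int16_t x_hi = (int16_t)(x >> 16);
    int16_t y_lo = (int16_t)(y & 0xFFFFu);
    int16_t y_hi = (int16_t)(y >> 16);

    int64_t sum = (int64_t)acc
                + (int32_t)x_lo * (int32_t)y_lo
                + (int32_t)x_hi * (int32_t)y_hi;
    return (int32_t)sum;
}
```

With this model, `smlad_model(pack_q15x2(2, 3), pack_q15x2(4, 5), 10)` returns 33 (10 + 2*4 + 3*5). A processor without the DSP extensions needs separate multiply and add instructions for each product, which is where much of the performance gap comes from.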
Why combine control and DSP in a single Arm CPU? Here’s a quick overview:
To get started with Arm-based chips with DSP development, check out Arm's silicon partners. NXP, ST, and Nordic Semiconductor announced Cortex-M33 based chips with DSP functionality last year. Read more on the TrustZone for Armv8-M community page.
The CMSIS-DSP and CMSIS-NN libraries are suites of common signal processing and mathematical functions that have been optimized for Cortex-M processors. They are freely available as part of the CMSIS release and include all source code. The functions are divided into several categories:
The library has separate functions for operating on 8-bit integers, 16-bit integers, 32-bit integers, and 32-bit floating-point values. You can use, modify, and distribute the CMSIS-DSP source code without any obligation to publish details of your own software.
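CMSIS-DSP names these four data types `q7_t`, `q15_t`, `q31_t` and `float32_t`, and treats the integer types as fractional fixed-point ("Q") values. The sketch below shows how the 16-bit format can be read as a fraction in the range [-1, 1). It defines its own stand-in typedef so that it compiles without the library, and the function names `float_to_q15` and `q15_mul_sat` are hypothetical, not CMSIS-DSP functions.

```c
#include <stdint.h>

typedef int16_t q15_local_t;  /* local stand-in for CMSIS-DSP's q15_t */

/* Convert a float in [-1, 1) to Q15 (value * 2^15), saturating at the
 * ends of the 16-bit range. Truncates toward zero. */
static q15_local_t float_to_q15(float v)
{
    float scaled = v * 32768.0f;
    if (scaled >= 32767.0f) {
        return INT16_MAX;
    }
    if (scaled <= -32768.0f) {
        return INT16_MIN;
    }
    return (q15_local_t)scaled;
}

/* Multiply two Q15 fractions and return a Q15 result.
 * The 32-bit product is in Q30, so shifting right by 15 gives Q15.
 * Right-shifting a negative value is implementation-defined in C;
 * mainstream compilers perform an arithmetic shift. */
static q15_local_t q15_mul_sat(q15_local_t a, q15_local_t b)
{
    int32_t p = ((int32_t)a * (int32_t)b) >> 15;
    if (p > INT16_MAX) {
        p = INT16_MAX;  /* only (-1) * (-1) can overflow */
    }
    return (q15_local_t)p;
}
```

Here 0.5 is stored as 16384, and multiplying 0.5 by 0.5 gives 8192, which is 0.25 in Q15. Fixed-point formats like this let integer-only cores run filters without a floating-point unit, at the cost of managing scaling and saturation explicitly.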
Here is an example of signal processing applied to noise removal and signal detection on an electrocardiography (ECG) recording. A Cortex-M microcontroller captured the ECG physiological data at a 500Hz sampling rate. The data stream was processed through a noise removal algorithm (upper wave below), and pulse detection was applied to the cleaned version of the data (second wave below).
Noise removal and detection of ECG data
The noise removal suppresses the low-frequency modulation and the 60Hz interference from AC power lines. The detection algorithm finds the peaks in the input stream over a sliding window and determines the start of the heart period using statistical estimation.
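The detection step is not specified in detail here, so the following is only a simplified stand-in: it tracks the energy of the signal over a sliding window and reports the first sample where that energy exceeds a fixed threshold. The fixed threshold replaces the statistical estimation mentioned above, and the function name `first_energy_crossing` and its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the first sample at which the energy of the most
 * recent `window` samples exceeds `threshold`, or -1 if it never does.
 * Simplified illustration; not the detection algorithm used in the
 * ECG example. */
static long first_energy_crossing(const int16_t *x, size_t len,
                                  size_t window, int64_t threshold)
{
    int64_t energy = 0;
    for (size_t n = 0; n < len; ++n) {
        /* Add the newest sample's energy to the running sum. */
        energy += (int32_t)x[n] * (int32_t)x[n];
        /* Remove the sample that just left the window. */
        if (n >= window) {
            energy -= (int32_t)x[n - window] * (int32_t)x[n - window];
        }
        if (energy > threshold) {
            return (long)n;
        }
    }
    return -1;
}
```

Updating the running sum costs one multiply-accumulate and one multiply-subtract per sample, regardless of the window length, which keeps the per-sample cycle count low and constant.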
The filter uses the following three poles, given as angle/radius: (0.05 rad / 0.98), (0.25 rad / 0.9) and (0.45 rad / 0.97), and three zeros, all placed on the unit circle of the z-plane, at angles 0.02 rad, 0.65 rad and 1 rad. The filter gain is 0.02. This filter removes the close-to-DC spectral components and the noise around the power-line frequency, in the 50Hz to 60Hz range.
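To make the pole and zero values concrete, the transfer function can be written as a product of second-order sections. This is a sketch under two assumptions that are not stated explicitly: each listed pole and zero stands for a complex-conjugate pair (consistent with a cascade of three bi-quad sections), and the gain of 0.02 applies to the whole cascade. The symbols $g$, $\theta_i$, $\phi_i$ and $r_i$ are introduced here for illustration. The order in which zeros are paired with poles does not change the overall product.

```latex
H(z) = g \prod_{i=1}^{3}
  \frac{1 - 2\cos(\theta_i)\, z^{-1} + z^{-2}}
       {1 - 2\, r_i \cos(\phi_i)\, z^{-1} + r_i^{2}\, z^{-2}},
\qquad g = 0.02

% Zero angles, pole angles (rad) and pole radii from the text:
\theta = (0.02,\ 0.65,\ 1), \qquad
\phi   = (0.05,\ 0.25,\ 0.45), \qquad
r      = (0.98,\ 0.9,\ 0.97)

% Mapping a digital angle \omega (rad/sample) to a frequency in Hz
% at the sampling rate f_s = 500 Hz:
f = \frac{\omega\, f_s}{2\pi}
\quad\Rightarrow\quad
0.02 \mapsto 1.6\ \text{Hz}, \quad
0.65 \mapsto 51.7\ \text{Hz}, \quad
1 \mapsto 79.6\ \text{Hz}
```

Under these assumptions, the zero at 0.02 rad sits close to DC and the zero at 0.65 rad falls between 50Hz (0.628 rad) and 60Hz (0.754 rad), which matches the two kinds of noise the filter is described as removing.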
Filter characteristics from the MathWorks filter design tool
When running on a Cortex-M3 processor, the ECG signal processing consumes less than 0.1MHz of CPU load. More precisely, processing one second of the signal through a cascade of three bi-quad filters takes 55k cycles (110 cycles per sample at 500Hz), and the energy computation and threshold detection take 15k cycles, for a total of 70k cycles/s, or 0.07MHz. Adding some implementation margin and time for buffer copies brings this to approximately 0.1MHz.
I hope this blog demonstrated the benefits of combining DSP and control in a single CPU from Arm. As markets move more towards streaming, connectivity, and interactive user interfaces, there will be an increasing demand for performance in low-power, embedded devices. Using a single microcontroller with DSP capabilities, rather than a lower performance microcontroller with separate DSP, reduces BoM cost, system-level complexity, software development costs, and timescales.
We expect that an ever-increasing number of consumer devices will benefit from the high-performance, low-power and low-latency response of the Cortex-M4, Cortex-M7, Cortex-M33 and Cortex-M35P processors from the Cortex-M family. Combine those with Arm’s free software libraries to get a head start on DSP development.
Access the free CMSIS-DSP software library
Nice example! You said that the example consumes 0.1 MHz in a Cortex M3, but why are you mentioning an M3 if they don't have DSP extensions?