This article is a follow-on to Navigating the Cortex Maze. As a high-level overview, the earlier article provides an easy way in to the ARM processor range. It covers Cortex-A (architecture ARMv7-A), Cortex-R (ARMv7-R) and Cortex-M (ARMv7-M).
But the Cortex-M story has much more depth than that and warrants some further explanation. You see, it doesn’t end with ARMv7-M…
The Cortex-M3 processor, the first of the ARMv7 cores, was released in 2004. It supported the new architecture profile ARMv7-M. As I explained in the earlier piece, this architecture was targeted at microcontrollers and incorporated a number of changes aimed at low-cost, low-footprint devices, enabling a high degree of standardization across multiple vendors. It has been astonishingly successful. By the end of 2012, over 2 billion Cortex-M microcontroller devices had shipped.
As usual, one size doesn’t fit all requirements. The microcontroller space is huge and very diverse. At the top end, devices need to carry out significant amounts of arithmetic and numerical data processing in real time and need the instruction set capability to support that. At the other end of the scale, lots of processing might not be required, and the drivers are extremely low cost and small size, coupled with maximum power efficiency.
Good though ARMv7-M (and the Cortex-M3) is, it can’t address all these requirements in a single architecture profile.
So, today, we find there are four devices in the Cortex-M range, supporting two distinct architectures and four incremental instruction sets.
ARMv7-M supports only the Thumb-2 instruction set. Coupled with a simple two-mode programmer’s model, privilege levels (which do not have to be used at run-time) and a very simple register banking scheme, this makes for very compact and efficient designs. This is the architecture of the Cortex-M3 and it is applicable to a large range of microcontroller applications.
ARMv7E-M is currently supported by the Cortex-M4 processor. It builds on ARMv7-M by adding a set of saturating and SIMD instructions. This set, called the “DSP extension”, significantly increases the capability of the core in DSP applications. These instructions cover operations like signed and unsigned saturated arithmetic, byte and halfword packing/unpacking, dual 16-bit and quad 8-bit operations, and extended halfword multiply-accumulate.
There is a further variant of the Cortex-M4 which includes the “Floating-point extension”. This adds a further group of instructions for single-precision floating point, supported by the addition of an FPU to the standard Cortex-M4. These instructions operate on an extended register bank of 32 single-precision registers and provide single-precision floating point arithmetic, comparison, and data transfer between the extension registers, core registers and memory. Of course, the standard Cortex-M4, without the FPU extension, can still handle floating point arithmetic in software, but this will take longer and require extra code.
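To make "saturating" and "dual 16-bit" concrete, here is a minimal sketch in portable C of the arithmetic results such instructions produce. The function names `sat_add32` and `dual_add16` are illustrative only and are not part of any ARM library. On a core with the DSP extension, a compiler or intrinsic would typically map this kind of operation onto a single instruction (for example QADD or SADD16). The sketch models only the arithmetic result, not any status flags.

```c
#include <stdint.h>

/* Illustrative model of a signed 32-bit saturating add: the result
   clamps to INT32_MAX or INT32_MIN instead of wrapping around. */
int32_t sat_add32(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a + (int64_t)b;  /* cannot overflow in 64 bits */
    if (wide > INT32_MAX) return INT32_MAX;
    if (wide < INT32_MIN) return INT32_MIN;
    return (int32_t)wide;
}

/* Illustrative model of a dual 16-bit add: one 32-bit word holds two
   independent halfword lanes, and each lane wraps modulo 2^16 without
   carrying into its neighbour. */
uint32_t dual_add16(uint32_t x, uint32_t y)
{
    uint32_t lo = (x + y) & 0xFFFFu;
    uint32_t hi = ((x >> 16) + (y >> 16)) & 0xFFFFu;
    return (hi << 16) | lo;
}
```

Without the DSP extension, each of these takes several ordinary instructions, which is where the gain in signal-processing loops comes from.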
The remaining members of the Cortex-M family support a slightly different architecture: ARMv6-M. These are Cortex-M0, Cortex-M0+ and Cortex-M1. The first of these to be released, Cortex-M1, appeared in 2007 and is designed for FPGA applications.
ARMv6-M is aimed at the very low end of the 32-bit microcontroller space, enabling very low gate-count designs with very simple and highly efficient microarchitecture. Several features of ARMv7-M are removed to enable this simplicity. The following is a summary of the major changes:
From the descriptions above, it should be apparent that each architecture is a superset of those below it in the range. This is illustrated by the diagram below which shows instruction set support from Cortex-M0 to Cortex-M4 FPU.
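The same layering is visible to software at build time. The sketch below assumes a toolchain that defines the usual predefined macros for these profiles (`__ARM_ARCH_6M__`, `__ARM_ARCH_7M__`, `__ARM_ARCH_7EM__`) and the ACLE macro `__ARM_FP`, in which bit 2 (0x04) indicates hardware single-precision support. The function names are hypothetical, and other compilers may spell these macros differently, so treat this as an illustration rather than a definitive detection scheme.

```c
/* Report which M-profile architecture the code was compiled for,
   based on compiler-predefined macros (assumed to be available). */
const char *m_profile_name(void)
{
#if defined(__ARM_ARCH_7EM__)
    return "ARMv7E-M";   /* e.g. Cortex-M4: ARMv7-M plus DSP extension */
#elif defined(__ARM_ARCH_7M__)
    return "ARMv7-M";    /* e.g. Cortex-M3 */
#elif defined(__ARM_ARCH_6M__)
    return "ARMv6-M";    /* e.g. Cortex-M0, Cortex-M0+, Cortex-M1 */
#else
    return "unknown";    /* not one of the profiles discussed here */
#endif
}

/* Return 1 if the build targets hardware single-precision floating
   point, 0 if floating point would be handled by software routines. */
int has_hw_single_float(void)
{
#if defined(__ARM_FP) && (__ARM_FP & 0x04)
    return 1;
#else
    return 0;
#endif
}
```

Because each profile is a superset of the one below, code written for the smallest target should need no source changes higher up the range. Guards like these are only needed around code that opts in to the extra instructions.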
Hopefully, you can now see that the structure of the Cortex-M microprocessor product range encourages a high degree of standardization across vendors, applications and performance points. To build on that, ARM has worked with the industry to define the Cortex Microcontroller Software Interface Standard (CMSIS). This is a consistent and simple software interface to the processor for peripherals, real-time operating systems and middleware. The goal is to simplify and maximize software reuse. CMSIS reduces the learning curve and shortens time to market.