Cortex-M0+ a year after: smaller, thriftier and smarter!

September 11, 2013


As usual, it happened late on a Friday afternoon. A couple of weeks ago a message arrived in my inbox from one of our latest ARM® Cortex™-M0+ partners: "We're using 90LP and a similar configuration to your 'min', with just a couple of additional and relatively small options, and we can't match your reported dynamic consumption (11.2µW/MHz). We can't figure out what's wrong. Can you please help us find out what we may have missed?"

For a fraction of a second I wondered whether I should pretend I was already gone and get back to them on Monday. Well... no, I wanted to get to the bottom of it, and we started exchanging mails: How many tracks for the cell library? Target synthesis frequency? Which precise processor options? And after each exchange, I got even more confused. So focused on the details, I had missed the right question from the start: "How far are you over the 11.2µW/MHz?" The answer: "Well, in fact we are 7% below your figures, around 10.4µW/MHz, and we were not expecting to match your marketing values, even less to be lower. Could it be that some parts of the processor are not synthesized or clocked?"

It was now my turn to make a late Friday afternoon call to our implementation manager: "Let me check and try some experiments, I should have something for you on Monday".

Back in the office first thing on Monday morning, I found a message from Jonathan: "Flycatcher new PPA @ 90LP"... hmm, someone was working late on Sunday evening. Reading quickly through the text: "new baseline flow 2013.03 ... relaxed max trans... tightened up the floor plan... better utilization.... additional area/power recovery step... ban use of high drive cells.... et voila: 9.828µW/MHz!". Better still, it sounds like with some more work we can probably go even lower.
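To put the three figures side by side, here is a small sketch of the arithmetic. It uses only the µW/MHz numbers quoted above; the helper names and the example clock frequency are my own illustrative choices, and the power estimate assumes dynamic power scales linearly with clock frequency.

```python
# Dynamic power figures quoted in the post, in µW/MHz (TSMC 90LP).
REPORTED_UW_PER_MHZ = 11.2    # figure the partner was trying to match
PARTNER_UW_PER_MHZ = 10.4     # partner's own result ("around 10.4")
NEW_FLOW_UW_PER_MHZ = 9.828   # result from the 2013.03 flow


def percent_below(value, baseline):
    """Return how far `value` sits below `baseline`, as a percentage."""
    return (baseline - value) / baseline * 100.0


def dynamic_power_uw(uw_per_mhz, clock_mhz):
    """Estimate dynamic power in µW, assuming linear scaling with clock."""
    return uw_per_mhz * clock_mhz


partner_saving = percent_below(PARTNER_UW_PER_MHZ, REPORTED_UW_PER_MHZ)
new_flow_saving = percent_below(NEW_FLOW_UW_PER_MHZ, REPORTED_UW_PER_MHZ)

print(f"Partner result:  {partner_saving:.1f}% below the reported figure")
print(f"New flow result: {new_flow_saving:.2f}% below the reported figure")

# Hypothetical clock frequency, chosen for illustration only.
EXAMPLE_CLOCK_MHZ = 50
example_power = dynamic_power_uw(NEW_FLOW_UW_PER_MHZ, EXAMPLE_CLOCK_MHZ)
print(f"At {EXAMPLE_CLOCK_MHZ} MHz: about {example_power:.1f} µW dynamic power")
```

The partner's result comes out roughly 7% below the reported figure, matching their message, and the new flow lands a little over 12% below it.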

So indeed, using a more recent flow and spending a little more time on the routing gives a much better result than the trials we ran before launch just over a year ago. Here is the PPA comparison using TSMC's 90LP with a 7-Track RVt lib at 50MHz, fully routed, extracted and STA'ed:

The Cortex-M0+ CoreMark performance also improved a few weeks ago, rising from 1.77 CoreMark/MHz at launch in March 2012 to 2.15 CoreMark/MHz now, using the latest ARMcc v5.03 (see footnote): a healthy 21.5% increase.
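As a quick check of these numbers, the sketch below derives CoreMark/MHz from the footnote's raw score and clock, then computes the improvement over the launch figure. The variable and function names are mine; the input values are the ones quoted in the post.

```python
# CoreMark figures quoted in the post and its footnote.
LAUNCH_COREMARK_PER_MHZ = 1.77   # at launch, March 2012
QUOTED_COREMARK_PER_MHZ = 2.15   # figure quoted in the text (rounded)
FOOTNOTE_COREMARK_SCORE = 21.46  # raw CoreMark 1.0 score from the footnote
FOOTNOTE_CLOCK_MHZ = 10.0        # memory and CPU clock from the footnote


def percent_increase(new, old):
    """Return the relative increase from `old` to `new`, as a percentage."""
    return (new - old) / old * 100.0


# Normalise the raw score by the clock to get CoreMark/MHz.
measured_per_mhz = FOOTNOTE_COREMARK_SCORE / FOOTNOTE_CLOCK_MHZ

quoted_gain = percent_increase(QUOTED_COREMARK_PER_MHZ, LAUNCH_COREMARK_PER_MHZ)
measured_gain = percent_increase(measured_per_mhz, LAUNCH_COREMARK_PER_MHZ)

print(f"CoreMark/MHz from footnote: {measured_per_mhz:.3f}")
print(f"Gain using rounded 2.15:    {quoted_gain:.1f}%")
print(f"Gain using unrounded score: {measured_gain:.1f}%")
```

The quoted 21.5% follows from the rounded 2.15 CoreMark/MHz; using the unrounded 2.146 from the footnote gives a slightly lower figure of about 21.2%.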

Beyond the intrinsic processor Power, Performance and Area (PPA), it is important to remember that Cortex-M0+ was designed from the beginning to significantly reduce the number of memory accesses in order to optimize energy consumption at system level: in general, memories, both SRAM and FLASH, are even more energy hungry than the processor itself. Thanks to its 2-stage pipeline and additional smart optimizations, it has the lowest instruction fetch activity across the Cortex-M family.

So, even more than when we introduced Flycatcher, aka Cortex-M0+, just over a year ago, the ARM Partnership can enjoy the most energy-efficient and size-optimized embedded processor. And if you can beat Jonathan's PPA, feel free to drop us a mail, even late on Friday afternoon!

Footnote: CoreMark 1.0: 21.46 / ARM C Compiler 5.03 [Build 24] -O3 --loop_optimization_level=2 -Otime -DMICROLIB --library_type=microlib --cpu=cortex-m0 / FPGA platform, Code in SRAM - Data in SRAM, memory and CPU clocked @10MHz
