Squaring the circle - Optimizing power efficiency in a Cortex-A15 processor

Haydn Povey
September 11, 2013

It is entirely appropriate that ARM will announce technical details of its latest hard macro product, the Cortex™-A15 MP4 Hard Macro for the TSMC 28HPM node, at COOL Chips XV, the IEEE Symposium on Low-Power and High-Speed Chips, being held this week in Yokohama, Japan (18-20 April 2012). This exciting new hard macro not only perfectly encapsulates the theme of the symposium, but also pulls together the contemporary and divergent design challenges of offering extremely high-performance compute engines within a conservative power budget.

The Cortex-A15 MP4 Hard Macro is a high-performance, power-optimized quad-core hard macro implementation of our flagship Cortex-A15 processor on a leading 28nm process. It delivers three significant firsts for the ARM hard macro portfolio: it is the first quad-core hard macro, the first hard macro based on the Cortex-A15 processor (the highest-performance ARMv7 architecture-based processor), and the first hard macro on a 28nm process.

In terms of configuration, the Cortex-A15 MP4 Hard Macro includes:

  • NEON™ and Floating Point Unit (FPU) technology
  • ECC for L1 and L2 RAMs (L1-I cache has single bit parity)
  • 2x32KB L1 and 2MB L2 caches
  • 224 interrupts, 6 power domains
  • AMBA® Protocol Domain Bridge, CoreSight™, AMBA APB™, ATB, Funnel

The hard macro has been developed using ARM Artisan® 12-track libraries and Processor Optimization Pack™ (POP) solutions for the Cortex-A15 processor on the TSMC 28nm HPM process.

In my earlier blog I outlined the three main challenges in modern SoC design: those arising from the rapid evolution of processor technology, the jumps in process implementation technology, and the ever-present commercial challenges, which have sharpened in the recent global economic climate. I went on to demonstrate why ARM hard macros are a very exciting and credible solution for silicon vendors.

I will resist the urge to repeat the message here, but it is worth noting that with every jump in complexity for the processor and the process node, there is a significant rise in the challenges, costs and risks associated with getting the SoC implementation right and on time. Today, the SoC development challenge is perhaps greatest when designing with the latest high-performance multicore processors, such as the Cortex-A15 processor, on leading geometries.

One of the biggest challenges in designing high-performance systems on the latest nodes is keeping dynamic power and leakage low. It is here that the Cortex-A15 hard macro really excels, delivering blistering performance of more than 2GHz and in excess of 20,000 DMIPS, while maintaining the power efficiency of the Cortex-A9 hard macro. This makes the latest macro offering from ARM a real and timely boon to SoC designers venturing into what are, for many, uncharted territories.
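As a rough sanity check on those headline figures, the short Python sketch below works out the per-core throughput they imply. It assumes that the 20,000 DMIPS figure is the aggregate across all four cores and that it is reached at 2GHz; neither assumption is stated explicitly in the text, and the function name is purely illustrative.

```python
def implied_dmips_per_mhz(total_dmips: float, cores: int, freq_mhz: float) -> float:
    """Per-core DMIPS/MHz implied by an aggregate DMIPS figure.

    Assumes throughput scales linearly with core count and frequency.
    """
    return total_dmips / (cores * freq_mhz)


# Figures quoted in the post, taken at their lower bounds
# ("more than 2GHz", "in excess of 20,000 DMIPS").
per_core = implied_dmips_per_mhz(total_dmips=20_000, cores=4, freq_mhz=2_000)
print(f"{per_core:.2f} DMIPS/MHz per core")  # prints "2.50 DMIPS/MHz per core"
```

Because both quoted numbers are lower bounds, the result is best read as an approximate floor rather than a measured value.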

To achieve this low-leakage, high-performance implementation, some of the best brains at ARM pitted their expertise against a series of design challenges and decision points across all stages of the implementation flow.

Consider the challenge of picking the right base library combinations from the various foundry process offerings on 28nm: several Vt options, channel length variations and literally thousands of cell choices. Picking the combinations that would deliver the desired Performance, Power and Area (PPA) targets was a crucial first step on the way to success.

Then there was the challenge of managing the diverging needs of silicon vendors, who wish to use the full entitlement of the process geometry to build highly complex SoCs, and product developers, who focus on providing consumers with best-in-class battery life. It was clear that the Cortex-A15 hard macro would need some sophisticated power management schemes to ensure both needs were met adequately. The power grid for the macro was designed to support typical frequency at worst-case process and operating conditions. The Cortex-A15 hard macro supports multiple power domains, and also supports DVFS (dynamic voltage and frequency scaling) across the two voltage domains, VSOC and VCORE.
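To see why DVFS matters so much for battery life, it helps to recall the classic CMOS switching-power approximation, P ≈ C·V²·f. The Python sketch below applies it to two operating points. The capacitance, voltages and frequencies are hypothetical values chosen for illustration, not characteristics of the hard macro, and leakage power is ignored.

```python
def dynamic_power(c_eff_farads: float, vdd_volts: float, freq_hz: float) -> float:
    """Classic CMOS switching-power approximation: P = C_eff * Vdd^2 * f."""
    return c_eff_farads * vdd_volts ** 2 * freq_hz


# Hypothetical operating points (assumed, not taken from any ARM datasheet).
C_EFF = 1.0e-9  # effective switched capacitance in farads (assumed)
high = dynamic_power(C_EFF, vdd_volts=1.0, freq_hz=2.0e9)
low = dynamic_power(C_EFF, vdd_volts=0.8, freq_hz=1.0e9)

print(f"high point: {high:.2f} W")
print(f"low point:  {low:.2f} W ({low / high:.0%} of high)")
```

In this example, halving the frequency while lowering the supply by 20% cuts dynamic power to roughly a third, because the voltage term is squared. That quadratic dependence is the reason scaling voltage together with frequency saves far more than scaling frequency alone.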

An interesting timing closure challenge for the design team was to overcome the limitations of traditional fixed OCV (On-Chip Variation) derates and fixed margins, which are now running out of steam. For example, a 15ps increase in the margin can add 200% more hold buffers. The Cortex-A15 hard macro uses Advanced OCV (AOCV) techniques, which provide more flexible margins, but the lack of full EDA support for AOCV made things interesting.
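The difference between the two approaches can be sketched in a few lines of Python. Flat OCV applies one pessimistic derate to every path, whereas AOCV looks the derate up by logic depth, since random cell-to-cell variation tends to average out along longer paths. Everything below is an illustrative assumption: the derate values, the table and the function names are hypothetical, and real AOCV tables are produced by library characterization and are typically indexed by distance as well as depth.

```python
from bisect import bisect_right

FLAT_OCV_DERATE = 1.10  # assumed single late derate applied to every path

# Hypothetical AOCV table of (logic depth, late derate) pairs.
# Deeper paths get a smaller derate because random variation averages out.
AOCV_TABLE = [(1, 1.12), (2, 1.10), (4, 1.08), (8, 1.06), (16, 1.04)]


def aocv_derate(depth: int) -> float:
    """Return the derate of the deepest table entry not exceeding `depth`."""
    depths = [d for d, _ in AOCV_TABLE]
    idx = max(bisect_right(depths, depth) - 1, 0)
    return AOCV_TABLE[idx][1]


def derated_delay(cell_delays_ps: list[float], derate: float) -> float:
    """Scale the summed nominal cell delays of a path by a late derate."""
    return sum(cell_delays_ps) * derate


path = [25.0] * 12  # a 12-cell path with 25 ps per cell (assumed)
flat = derated_delay(path, FLAT_OCV_DERATE)
aocv = derated_delay(path, aocv_derate(len(path)))

print(f"flat OCV: {flat:.1f} ps, AOCV: {aocv:.1f} ps, "
      f"pessimism removed: {flat - aocv:.1f} ps")
```

With these made-up numbers the depth-aware derate removes about 12 ps of pessimism from a single path. Given how sensitive buffer counts are to margins of that size, this is the kind of slack that makes the extra tooling effort worthwhile.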

I would love to go on further about the creativity of the design but I'm aware that my editors asked for a blog, not a whitepaper. 

It is fair to conclude that the unmatched power efficiency in this high-performance Cortex-A15 MP4 implementation was achieved by capitalizing on the vast implementation expertise available in ARM, and by leveraging the tight synergy that exists between ARM CPU, Physical and Fabric IP teams.
