The Top 5 Things to Know about Cortex-A53

The Arm Cortex-A53 was introduced to the market in October 2012, delivering the Armv8 instruction set and significantly increased performance in a highly efficient power and area footprint. It is available for licensing now, and will be deployed in silicon in early 2014 by multiple Arm partners. There are a few key aspects of the Cortex-A53 that developers, OEMs, and SoC designers should know:

1. Arm low power / high efficiency heritage

The Arm9 is the most licensed processor in Arm’s history, with over 250 licenses sold. It identified a very important power/cost sweet spot. The Cortex-A5 (launched in 2009) was designed to fit in the same CPU power and area footprint while delivering significantly higher performance and power efficiency. It also brought the modern Armv7 feature set to that footprint, providing software compatibility with the high end of the processor roadmap (then the Cortex-A9).

Arm926-based feature phone (Nokia E60).

Arm Cortex-A5, Cortex-A7 and Cortex-A53 stats

The Cortex-A53 is built around a simple pipeline, 8 stages long with in-order execution like the Cortex-A7 and Cortex-A5 processors that preceded it. An instruction traversing a simple pipeline requires fewer registers and switches less logic to fetch, decode, issue, execute, and write back the results than a more complex pipeline microarchitecture. Simpler pipelines are smaller and lower power. The high efficiency Cortex-A CPU product line, consisting of Cortex-A5, Cortex-A7, and Cortex-A53, takes a design approach prioritizing efficiency first, then seeking as much performance as possible at the maximum efficiency. The added performance in each successive generation in this series comes from advances in the memory system, increasing dual-issue capability, expanded internal busses, and improved branch prediction.

2. Armv8-A Architecture

The Cortex-A53 is fully compliant with the Armv8-A architecture, the latest Arm architecture, which introduces support for 64-bit operation while maintaining 100% backward compatibility with the broadly deployed Armv7 architecture. The processor can switch between the AArch32 and AArch64 execution states, allowing 32-bit and 64-bit apps to run together on top of a 64-bit operating system. This dual execution state support gives developers and SoC designers maximum flexibility in managing the rollout of 64-bit support in different markets. Armv8-A also brings additional features (more registers, new instructions) that increase performance, and the Cortex-A53 is able to take advantage of these.
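For developers, the execution state is fixed when an app is built, so the same source can report which state it targets. The sketch below is one way to do that, assuming a GCC- or Clang-style toolchain that predefines `__aarch64__` for AArch64 targets and `__arm__` for AArch32 targets. The function names are hypothetical and used only for illustration.

```c
#include <limits.h>
#include <stddef.h>

/* Name the execution state this code was compiled for.
 * Assumes the compiler predefines __aarch64__ (AArch64) or __arm__ (AArch32),
 * as GCC and Clang do; any other target is reported as "non-Arm". */
const char *compiled_execution_state(void)
{
#if defined(__aarch64__)
    return "AArch64";
#elif defined(__arm__)
    return "AArch32";
#else
    return "non-Arm";
#endif
}

/* Pointer width in bits for this build: 64 on the usual AArch64 (LP64) ABI,
 * 32 for AArch32 builds. */
size_t pointer_width_bits(void)
{
    return sizeof(void *) * CHAR_BIT;
}
```

Built once for each state, the two binaries should report different values, yet both can run side by side on a 64-bit operating system on the same Cortex-A53.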

3. Higher performance than Cortex-A9: smaller and more efficient too

The Cortex-A9 features an out-of-order pipeline, dual-issue capability, and a longer pipeline than the Cortex-A53 that enables 15% higher frequency operation. However, the Cortex-A53 achieves higher single-thread performance by pushing a simpler design farther. Key factors enabling its performance include the integrated low-latency level 2 cache, the larger 512-entry main TLB, and the complex branch predictor. The Cortex-A9 set the bar for the high end of the smartphone market through 2012. By matching and exceeding that level of performance in a smaller footprint and power budget, the Cortex-A53 brings to entry-level devices the performance previously enjoyed only by high-end flagship mobile devices, and does so at lower cost. The graph below compares the single-thread performance of the high-efficiency Cortex-A processors with the Cortex-A9. At the same frequency, the Cortex-A53 delivers more than 20% higher instruction throughput than the Cortex-A9 for representative workloads.

Graph of single thread performance Cortex-A versus Cortex-A9
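The two figures quoted above (15% higher frequency for the Cortex-A9, more than 20% higher throughput per clock for the Cortex-A53) can be combined into a rough estimate. The sketch below assumes a simple model in which single-thread performance is throughput per clock multiplied by clock frequency, and it treats "more than 20%" as exactly 20%. Both the model and the function names are illustrative assumptions, not Arm's benchmarking method.

```c
/* Relative performance of one core versus another, modelled as
 * (throughput-per-clock ratio) x (clock-frequency ratio). */
double relative_performance(double throughput_ratio, double freq_ratio)
{
    return throughput_ratio * freq_ratio;
}

/* Cortex-A53 versus Cortex-A9 when the Cortex-A9 runs at its
 * 15% higher clock, using the figures quoted in the text. */
double a53_vs_a9_at_max_clock(void)
{
    const double a53_throughput_vs_a9 = 1.20;  /* "more than 20%" taken as 20% */
    const double a53_freq_vs_a9 = 1.0 / 1.15;  /* Cortex-A9 clocks 15% higher */
    return relative_performance(a53_throughput_vs_a9, a53_freq_vs_a9);
}
```

Under these assumptions the ratio is 1.20 / 1.15, or about 1.04. The Cortex-A53 would stay roughly 4% ahead even when the Cortex-A9 uses its full frequency advantage.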

4. Supports big.LITTLE with Cortex-A57

The Cortex-A53 is architecturally identical to the higher performance Cortex-A57 processor, and can be integrated with it in a big.LITTLE processor subsystem. big.LITTLE enables peak performance and extreme efficiency by distributing work to the right-sized processor for the task at hand.

big.LITTLE is described in more detail in Ten Things to Know About big.LITTLE.

Diagram Cortex-A53 combined with Cortex-A57 and Mali-T628 in CCI system

The diagram above shows the Cortex-A53 combined with the Cortex-A57 and a Mali-T628 graphics processor in an example system. The CCI-400 cache coherent interconnect combines the two CPU clusters seamlessly, so that software can manage task allocation in a highly transparent way, as described in <link – software>. The big.LITTLE system enables peak performance at low average power.
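To make "the right-sized processor for the task at hand" concrete, here is a toy sketch of a placement decision. It is not Arm's big.LITTLE software. The thresholds, type names, and the `choose_cluster` function are hypothetical and chosen only for illustration. A task moves to the big cluster when its load is high and returns to the LITTLE cluster when its load drops, with a gap between the two thresholds to avoid bouncing back and forth.

```c
/* Toy model only: cluster names and thresholds are illustrative assumptions. */
typedef enum { CLUSTER_LITTLE, CLUSTER_BIG } cluster_t;

#define UP_THRESHOLD_PCT   80  /* hypothetical: above this, move to big */
#define DOWN_THRESHOLD_PCT 30  /* hypothetical: below this, move to LITTLE */

/* Decide where a task should run next, given where it runs now and its
 * recent load as a percentage of one LITTLE core's capacity. */
cluster_t choose_cluster(cluster_t current, int load_pct)
{
    if (current == CLUSTER_LITTLE && load_pct > UP_THRESHOLD_PCT)
        return CLUSTER_BIG;      /* demanding work goes to the Cortex-A57 side */
    if (current == CLUSTER_BIG && load_pct < DOWN_THRESHOLD_PCT)
        return CLUSTER_LITTLE;   /* light work returns to the Cortex-A53 side */
    return current;              /* inside the hysteresis band: stay put */
}
```

In this model a task at 50% load stays wherever it already is. Only sustained heavy or light load triggers a migration between the clusters.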

The Cortex-A53 is also ideal for standalone use, delivering excellent performance at very low power and area and enabling new features in low-cost smartphone segments. Our new LITTLE processor packs a performance punch.

Read more about that in a somewhat humorous blog on Cortex-A53 from the product launch – Arm Cortex-A53 — Who You callin' LITTLE?

5. Extensive feature set for broad application support

The Cortex-A53 includes a feature set that allows it to be configured and optimized, through physical implementation, for both mobile SoCs and scalable enterprise systems.

| Mobile Features | Enterprise Features |
| --- | --- |
| AMBA 4 ACE coherent bus; big.LITTLE processing (2 CPU clusters) with CCI-400 interconnect | AMBA 5 CHI coherent bus; scalable to 4 or more coherent CPU clusters for low-cost servers or networking infrastructure devices; 16-core systems with CCN-504 or 32-core systems with CCN-508, all on a single silicon die |
| Small area, low power design | Optimized for small area, low power design. Likely still optimized for 150 mW. However, higher performance implementations can be used |
| ECC, parity available, but configurable if not needed | ECC and parity protection required for enterprise applications |

Learn more about Cortex-A53