
AMBA 4 ACE and Hardware Cache Coherency - Top 5 Questions

Neil Parris
October 14, 2013

I thought I'd post a short blog post about commonly asked questions on AMBA 4 ACE and system coherency.

What does ACE mean?

ACE stands for "AXI Coherency Extensions", introduced with the AMBA 4 specification released in 2011. For those of you thinking "What's AXI?", it's an on-chip bus standard that defines the interface and signalling used to connect processors, interconnects and memory controllers to make an SoC.

 

Why?

ACE allows processors and systems to share memory in a more efficient way called "hardware coherency". Before hardware coherency, systems had to rely on software coherency, which means that the application, drivers and operating system must carefully manage the sharing of any data between the processor and other system hardware like DMA, graphics and IO interfaces. Software coherency consists of carefully timed cache cleaning, maintenance and invalidation. Cache cleaning takes time and effort (cache contents need to be written out to main memory, DDR), and any mistakes can be very difficult to debug (sometimes data is just in the wrong place and it's not obvious why).
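
To make the clean and invalidate steps concrete, here is a minimal Python sketch of software-managed coherency. It is a toy model, not real driver code: the `Memory` and `WriteBackCache` classes, the `dma_read`/`dma_write` helpers and the buffer address are all invented for illustration. It shows the two classic mistakes: a DMA engine reading stale memory because the CPU did not clean its cache, and the CPU reading a stale cached copy because it did not invalidate.

```python
"""Toy model of software-managed coherency between a CPU cache and a DMA engine."""


class Memory:
    """Main memory (DDR), modelled as a dict from address to value."""

    def __init__(self):
        self.data = {}

    def read(self, addr):
        return self.data.get(addr, 0)

    def write(self, addr, value):
        self.data[addr] = value


class WriteBackCache:
    """A CPU-side write-back cache: writes stay in the cache until cleaned."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}      # addr -> value held in the cache
        self.dirty = set()   # addrs modified but not yet written to memory

    def cpu_write(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)

    def cpu_read(self, addr):
        if addr not in self.lines:           # miss: fetch from memory
            self.lines[addr] = self.memory.read(addr)
        return self.lines[addr]

    def clean(self, addr):
        """Write a dirty line back to memory so other masters can see it."""
        if addr in self.dirty:
            self.memory.write(addr, self.lines[addr])
            self.dirty.discard(addr)

    def invalidate(self, addr):
        """Drop the cached copy so the next read refetches from memory."""
        self.lines.pop(addr, None)
        self.dirty.discard(addr)


def dma_read(memory, addr):
    """A non-coherent DMA engine reads memory directly, bypassing the cache."""
    return memory.read(addr)


def dma_write(memory, addr, value):
    """A non-coherent DMA engine writes memory directly, bypassing the cache."""
    memory.write(addr, value)


def demo():
    mem = Memory()
    cache = WriteBackCache(mem)
    buf = 0x1000  # hypothetical buffer address

    # CPU fills a buffer; without a clean, the DMA engine sees stale memory.
    cache.cpu_write(buf, 42)
    stale_out = dma_read(mem, buf)
    cache.clean(buf)
    fresh_out = dma_read(mem, buf)

    # DMA delivers new data; without an invalidate, the CPU reads its old copy.
    dma_write(mem, buf, 99)
    stale_in = cache.cpu_read(buf)
    cache.invalidate(buf)
    fresh_in = cache.cpu_read(buf)

    return stale_out, fresh_out, stale_in, fresh_in


if __name__ == "__main__":
    print(demo())  # (0, 42, 42, 99)
```

With hardware coherency, the clean and invalidate calls in `demo()` are the part software no longer has to get right.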

Hardware coherency removes the software challenges, and in fact makes sharing transparent to the application. Hardware coherency is a critical component of big.LITTLE processing: it allows the big and LITTLE processor clusters to see the same view of memory and run the same operating system. Processes and applications can switch between the big and LITTLE cores as demand requires.

3rd Party Support?

AMBA 4 ACE is an open standard, which means it's freely available to download from the ARM website. There is also an ecosystem of EDA companies supporting this new standard, including Cadence, Jasper, Mentor and Synopsys.

Which Processors?

The latest Cortex processors all support AMBA 4 ACE. These include the big.LITTLE pairs: ARMv7 Cortex-A15 & Cortex-A7, and ARMv8 Cortex-A57 & Cortex-A53. While these processors will be used in big.LITTLE applications, we'll also see them used in enterprise applications like networking and servers, where hardware coherency is a must-have for high-performance interfaces like PCI Express, Ethernet and USB.

How do I connect 'ACE' components?

The ARM CoreLink CCI-400 Cache Coherent Interconnect is the first product to market to support AMBA 4 ACE. First released in 2011, CCI-400 has been licensed by over 20 ARM partners and you will see many big.LITTLE products announced during 2013. For those not familiar with SoC architecture, the 'interconnect' is the glue that connects all the building blocks that make up an SoC like Cortex processors, Mali graphics and CoreLink memory controllers.

Any more questions?

Please ask!

  • wangyong (over 11 years ago)

    Thanks a lot!

  • Neil Parris (over 11 years ago)

    Hi Wangyong, great question. In the most recent release, r1p4 (March 2014), we made a change that allows the logic for unused ports to be removed. New parameters let you configure the number of ACE-Lite slave ports and the number of ACE-Lite master ports (see table below). With this additional configuration, you can reduce the area and power that CCI-400 consumes if some ports are not required.

    Hope this helps,

    Neil.

    Ports                   Parameter         Supported values
    ACE slave ports         (fixed)           2
    ACE-Lite slave ports    NUM_ACE_LITE_SI   0-3
    ACE-Lite master ports   NUM_MI            2-3
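
As a rough illustration of how those ranges might be checked before a build, here is a small Python sketch. The parameter names and supported values come from the table above; the `check_cci400_config` helper and the dictionary format are hypothetical and are not part of any CCI-400 deliverable.

```python
# Supported ranges taken from the table above; ACE slave ports are fixed at 2.
ACE_SLAVE_PORTS = 2
SUPPORTED = {
    "NUM_ACE_LITE_SI": range(0, 4),  # 0-3 ACE-Lite slave ports
    "NUM_MI": range(2, 4),           # 2-3 ACE-Lite master ports
}


def check_cci400_config(params):
    """Return a list of problems with a proposed configuration (empty if OK).

    Hypothetical helper for illustration only.
    """
    errors = []
    for name, allowed in SUPPORTED.items():
        if name not in params:
            errors.append(f"{name} is missing")
        elif params[name] not in allowed:
            errors.append(
                f"{name}={params[name]} is outside {allowed.start}-{allowed.stop - 1}"
            )
    return errors


if __name__ == "__main__":
    # The case from the question: 2 ACE ports plus a single ACE-Lite slave port.
    print(check_cci400_config({"NUM_ACE_LITE_SI": 1, "NUM_MI": 2}))
    # An unsupported request for a fourth ACE-Lite slave port.
    print(check_cci400_config({"NUM_ACE_LITE_SI": 4, "NUM_MI": 2}))
```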

  • wangyong (over 11 years ago)

    Hi Neil,

    CCI-400 has a fixed configuration: 2 full ACE slave interfaces, 3 ACE-Lite I/O coherent slave interfaces and 3 master interfaces. So if I only use the 2 full ACE slave interfaces and 1 ACE-Lite slave interface, are the other 2 ACE-Lite slave interfaces wasted?

    Thanks!

  • Neil Parris (over 11 years ago)

    Hi Andy - that's a great question.

    At the simplest level it's about performance and complexity. The performance of the bigger Cortex-A cores is much higher, and to reach it they issue many parallel requests into the memory system at the same time. This is partly because everything is running at a higher frequency, which means more pipelining and, in turn, more latency. To combat latency we need more transactions in flight, and this requires a more advanced bus like AXI.

    A smaller microcontroller based on Cortex-M can get its work done with just 1 request into the system at a time. The frequencies and latencies in the system are lower, and the workload on the processor is much lighter than on, say, a Cortex-A57. Many of the Cortex-M cores have multiple AHB buses to allow them to run a few transactions in parallel, e.g. data accesses to peripherals in parallel with an instruction fetch on a different port.
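
The link between latency and transactions in flight can be put into numbers with Little's law: transactions in flight = transaction rate x latency. The Python sketch below works one example through. The bandwidth, latency and transaction size are made-up round figures chosen for easy arithmetic, not measurements of any Arm processor or interconnect, and the function names are illustrative.

```python
NS_PER_S = 10**9


def transactions_in_flight(bandwidth_bytes_per_s, latency_ns, bytes_per_transaction):
    """Little's law: outstanding transactions needed to sustain a bandwidth.

    Uses integer arithmetic and rounds up to a whole transaction.
    """
    numerator = bandwidth_bytes_per_s * latency_ns
    denominator = bytes_per_transaction * NS_PER_S
    return -(-numerator // denominator)  # ceiling division


def peak_bandwidth(outstanding, latency_ns, bytes_per_transaction):
    """Best-case bytes per second with a fixed number of outstanding transactions."""
    return outstanding * bytes_per_transaction * NS_PER_S // latency_ns


if __name__ == "__main__":
    # Illustrative figures only: 6.4 GB/s target, 100 ns latency, 64-byte transfers.
    print(transactions_in_flight(6_400_000_000, 100, 64))  # 10
    # One request at a time over the same latency caps out at 640 MB/s.
    print(peak_bandwidth(1, 100, 64))                       # 640000000
```

Under these assumed figures, a master limited to one outstanding request reaches a tenth of the target bandwidth, which is why higher-latency, higher-performance systems need a bus that supports many transactions in flight.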

    I will expand on this in a follow-up blog post summarizing the different AMBA standards.
    Thanks!

    Neil.

  • Andy Frame (over 11 years ago)

    Hi Neil,

    This is a great introduction to ACE and its use in big.LITTLE systems.

    Can you tell me a little bit more about the key top-level differences between an AXI system and an AHB system, and why different processors use different bus standards?

    Thanks

    Andy
