At last the time has come when I can talk about AMBA 5 - the next generation of ARM interface standards.
To start this discussion, it is worth answering a question I am frequently asked, usually along the lines of: what is the difference between "AMBA 4" and "ACE"?
The term "AMBA x" is used to refer to a generation of interface standards that are developed around the same time. Within each generation there are different interfaces to target different needs within an SoC, but one key aspect is that these all work well together. ACE, for example, is one of the interfaces within the AMBA 4 generation. I'm sure it's clear to all that it's not appropriate to have a 200-pin interface with multiple outstanding transactions, re-ordering and all the signalling required to support coherent caches when all you need to do is program a handful of configuration registers. So, within each generation of AMBA the aim is to provide a set of interfaces that target all the needs of today's complex SoC designs.
AMBA 5 CHI is targeting the interface to the coherent hub that is found in many of today's SoCs, hence the name "Coherent Hub Interface." We define the hub as the high performance interconnect central to the SoC responsible for connecting the high performance processors and memory controllers.
Back in the early days of on-chip coherency, it wasn't clear how much support coherency would need or how widely it would be adopted within any given SoC. It was always clear that there would be a few key components that would need to have fully coherent caches, but whether that would spread far across the SoC was less clear. The landscape today is that coherency is being more broadly adopted. The trend in enterprise today is more about an increasing number of coherent CPUs than a wholesale adoption of full coherency by all the processing engines within the SoC. In mobile, coherency is already used for big.LITTLE processing, and we will see technologies like Heterogeneous System Architecture (HSA) drive the GPU to become fully coherent.
AMBA 5 CHI has been developed with the increasing number of coherent requesters very much in mind and some of the key features apply as much in the non-coherent world as in the coherent world. It is important to be able to design larger and larger systems, with complex interconnect topologies, where the behaviours of the individual components do not impact on the overall performance and efficiency of the system. In some ways this is the essence of being scalable. To enable this, CHI ensures that the interconnect never becomes the bottleneck in the system. There should never be a time when critical system resources (which are invariably the memory controllers) are inefficient because of the interface and interconnect being used.
It's not only important to ensure that large numbers of components don't interfere with each other, but also to be able to control how they share the resources that are available within the SoC. Composability of these systems is needed to control the overall system Quality of Service (QoS) in a scalable fashion. To address this, CHI provides a QoS mechanism to control how the resources in the system are allocated without needing a detailed understanding of every component and how they might interact.
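To make the idea of allocating a shared resource without knowing the details of each component more concrete, here is a minimal sketch in Python. It is not the CHI QoS mechanism itself, which this post does not describe; it assumes, purely for illustration, that each request carries a numeric priority value and that a shared resource grants the highest value first, with ties broken in arrival order. The class `QosArbiter`, its methods and the requester names are all hypothetical.

```python
import heapq
import itertools


class QosArbiter:
    """Hypothetical arbiter in front of a shared resource (e.g. a memory controller).

    Assumption for illustration: a larger qos value means higher priority.
    """

    def __init__(self):
        self._heap = []
        # Monotonic counter so equal-priority requests are granted in arrival order.
        self._arrival = itertools.count()

    def submit(self, requester, qos):
        # heapq is a min-heap, so negate qos to pop the highest priority first.
        heapq.heappush(self._heap, (-qos, next(self._arrival), requester))

    def grant(self):
        # The arbiter looks only at the priority value, never at what the requester is.
        _, _, requester = heapq.heappop(self._heap)
        return requester


arbiter = QosArbiter()
arbiter.submit("display", qos=12)
arbiter.submit("cpu", qos=8)
arbiter.submit("dma", qos=2)
arbiter.submit("gpu", qos=8)

order = [arbiter.grant() for _ in range(4)]
print(order)
```

The point of the sketch is that the arbiter needs no knowledge of what a "display" or a "dma" engine does; the priority value attached to each request is the only thing it inspects, which is what lets this kind of scheme scale as components are added.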
As is so often the case, there are two somewhat opposing requirements when developing a new specification. On the one hand, it's important to move forward with new developments, but on the other hand it's important not to keep moving the goalposts. From that perspective, CHI builds on the same coherent protocol that is used in AMBA 4 ACE. This also ensures that IO coherent devices, which can make use of ACE-Lite and which make up a large number of the devices participating in the coherent world, can be interfaced efficiently to CHI.
It is at the lower layers of the interface, which control how the information is exchanged across the interface and routed across the interconnect, that some of the biggest advances occur. Two factors that contribute to improving interface performance are a handshake that can run at higher clock speeds, and a clean separation between the information required for transport and the payload. This separation simplifies the interconnect transport task of shipping payload from one port of the interconnect to another. With these advances in CHI, interconnects can be constructed that are more efficient and can run at higher clock frequencies.
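The separation of transport information from payload can be sketched in a few lines of Python. This is an illustration of the general idea only, not the CHI packet format: the names `TransportHeader`, `Packet`, `Interconnect`, `target_id` and `source_id` are hypothetical and chosen for readability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransportHeader:
    # Hypothetical fields: just enough information to move a packet between ports.
    target_id: int
    source_id: int


@dataclass(frozen=True)
class Packet:
    header: TransportHeader
    payload: bytes  # opaque to the interconnect in this sketch


class Interconnect:
    """Toy interconnect with one delivery queue per port."""

    def __init__(self, num_ports):
        self.ports = [[] for _ in range(num_ports)]

    def route(self, packet):
        # Routing reads only the header; the payload is forwarded untouched.
        self.ports[packet.header.target_id].append(packet)


fabric = Interconnect(num_ports=4)
pkt = Packet(TransportHeader(target_id=2, source_id=0), payload=b"\x01\x02\x03")
fabric.route(pkt)

print(len(fabric.ports[2]))
```

Because `route` never inspects `payload`, the transport logic stays the same whatever is being carried, which is the sense in which a clean split simplifies the job of shipping payload from one port to another.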
With AMBA 5 CHI focused on the coherent hub, the question arises about the rest of the SoC: does CHI have a role to play there? Whilst the CHI specification itself focuses on the coherent hub, many of its advances apply equally well to other areas of the SoC, and we will continue looking at the other aspects of on-chip communication in the AMBA 5 generation of specifications.
More information on AMBA specifications: https://www.arm.com/products/system-ip/amba-specifications