The Foundation for Next Generation Heterogeneous Devices

When we look at what’s happening in consumer electronic devices, there is a clear evolution path. The smartphone has become the central computing device for most people and is now being augmented by wearable devices. At the higher end, tablets continue to replace laptop purchases, and the emergence of a premium tablet class and convertibles makes the proposition even more compelling. Across all form factors there is a common thread: devices have responded to the consumer need to be constantly on, adaptable to their surroundings, and comfortable with multi-tasking across many applications.

The increase in pixel count on our screens and in our visual content means that our digital lives are in ultra high definition, and moving to 4K. The implication is that system bandwidth must stay at least one step ahead to avoid bottlenecks, as users have grown accustomed to seamless computing where every command is carried out instantaneously. In reality, it doesn’t matter how powerful a CPU or GPU becomes: if there is not enough memory bandwidth in the system, performance will be sluggish. It is clear, then, that next-generation SoCs require a holistic view to optimize performance.
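To put a rough number on the pixel-count argument, the sketch below estimates the memory bandwidth needed just to scan out one uncompressed frame buffer. The 32-bit pixel format and 60 Hz refresh rate are illustrative assumptions, not figures from this post, and a real system adds composition layers, GPU rendering and video traffic on top (and may reduce it with frame buffer compression).

```python
def scanout_bandwidth_gbps(width, height, bytes_per_pixel, fps):
    """Bandwidth in GB/s (1 GB = 1e9 bytes) to read one uncompressed surface every frame."""
    return width * height * bytes_per_pixel * fps / 1e9

# Assumed values for illustration: 4 bytes per pixel, 60 frames per second.
full_hd = scanout_bandwidth_gbps(1920, 1080, 4, 60)
uhd_4k = scanout_bandwidth_gbps(3840, 2160, 4, 60)

print(f"1080p scan-out: {full_hd:.2f} GB/s")
print(f"4K scan-out:    {uhd_4k:.2f} GB/s")
print(f"4K / 1080p:     {uhd_4k / full_hd:.1f}x")
```

Under these assumptions a single 4K surface needs about 2 GB/s, four times the 1080p figure, before any other master in the system has touched memory.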

The answer lies in the system


As the mobile market matures, SoC designers have come to realize that one of the key routes to optimizing performance is through the system. It is through system performance that the next generation of SoCs will differentiate.

At ARM we have recognized this and have been focusing on developing system IP that gets the most out of silicon. All of our IP (Cortex® processors, Mali™ GPUs and CoreLink™ System IP) is designed, validated and optimized together to ensure the best performance per watt. We work closely with our partners to ensure the mobile devices of tomorrow can deliver experiences that continue to amaze consumers.


User experience is dependent on these IP blocks




System-optimized IP enables greater SoC differentiation

  • Reduced CPU latency
  • Efficient utilization of interconnect and memory bandwidth
  • Quality of Service (QoS) guarantees
  • Faster design cycle

CoreLink System IP Drives Innovation for Mobile Devices



ARM has launched two new System IP products that provide the foundation for next-generation SoCs, enabling new computing possibilities through increased system performance, improved power savings and better system integration. The CoreLink CCI-550 Cache Coherent Interconnect is a best-in-class AMBA interconnect for ARMv8-A systems. The CoreLink DMC-500 Dynamic Memory Controller is a low-power, performance-optimized mobile memory controller with LPDDR4/3 support.




CoreLink CCI-550


CoreLink CCI-550 is the latest product in the market-leading ARM CoreLink Cache Coherent Interconnect family. Previous generation interconnects have been used in many millions of devices across multiple market segments, from mobile applications to smart TVs, automotive infotainment and cost-effective networking.


CoreLink CCI-550 delivers improvements in three key areas:


More bandwidth, less latency:

Greater than 60% increase in peak system bandwidth compared to CCI-500. This means that CoreLink CCI-550 is built and optimized for applications that require high-bandwidth throughput, providing fluid, responsive applications and user interfaces, acceleration for apps such as video and photo editing, and improved multi-tasking and multi-windowing.

QoS enhancements can reduce latency within the CCI by up to 20%

2x snoop hit bandwidth that extends efficiency across the system         


Advanced power efficiency           

Enables a fully coherent GPU, which simplifies software and increases performance. Hardware coherency enables shared virtual memory and removes the need to copy data and to perform time-consuming, software-managed cache maintenance.

An integrated snoop filter can save hundreds of mW of memory system power
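The idea behind a snoop filter can be illustrated with a toy model: the interconnect keeps a record of which masters may hold each cache line, so a coherent read only snoops those masters instead of broadcasting to all of them. The sketch below is a conceptual illustration only. The class, method and master names are hypothetical, and it does not describe how the CCI-550 is actually implemented.

```python
from collections import defaultdict

class SnoopFilter:
    """Toy directory recording which masters may hold each cache line (hypothetical model)."""

    def __init__(self, masters):
        self.masters = list(masters)
        self.sharers = defaultdict(set)  # line address -> masters that may cache it
        self.snoops_sent = 0             # snoops issued with the filter
        self.snoops_broadcast = 0        # snoops a filterless broadcast would issue

    def allocate(self, master, line):
        """Record that `master` has brought `line` into its cache."""
        self.sharers[line].add(master)

    def evict(self, master, line):
        """Record that `master` no longer holds `line`."""
        self.sharers[line].discard(master)

    def read(self, requester, line):
        """Return the masters that must be snooped for a coherent read."""
        others = [m for m in self.masters if m != requester]
        self.snoops_broadcast += len(others)
        targets = [m for m in others if m in self.sharers[line]]
        self.snoops_sent += len(targets)
        self.allocate(requester, line)
        return targets

# Hypothetical masters: two CPU clusters and a coherent GPU.
sf = SnoopFilter(["cpu_big", "cpu_little", "gpu"])
sf.allocate("cpu_big", 0x1000)

hit_targets = sf.read("gpu", 0x1000)   # only cpu_big may hold this line
miss_targets = sf.read("gpu", 0x2000)  # nobody holds this line: no snoops at all

print(hit_targets, miss_targets, sf.snoops_sent, sf.snoops_broadcast)
```

Every snoop that is filtered out is a cache lookup that a master does not have to perform, which is where the power saving in a scheme like this comes from.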


Scalability

Extensive configurability, including 1 to 6 ACE ports, means it can be optimized for a wide range of applications, from premium tablets at the high end down to smaller or cost-sensitive designs

Memory interfaces are scalable from 1 to 6, supporting high performance tablet requirements with 4K internal and external screens, and bandwidths exceeding 50GB/s



Figure: CCI-550 system bandwidth

CoreLink CCI-550 enables a fully coherent GPU. Fully coherent memory systems can unleash the heterogeneous computing power of CPU and GPU simultaneously. It’s an exciting new area for mobile computing and holds potential for the many applications that would benefit enormously from the extra processing power a GPU could provide.




Example use cases of a fully coherent GPU



CoreLink DMC-500


The CoreLink DMC-500 offers the lowest latency while supporting LPDDR4/3 memories at up to LPDDR4-4266+ transfer speeds. Together, the CoreLink DMC-500 and the CoreLink CCI-550 provide the best end-to-end performance from CPU to memory at the lowest power, while ensuring that important system-level functions such as coherency, QoS and TrustZone security are fully supported. The DMC-500 offers leading performance in the following ways:


Highly optimized, efficient memory access

  • 27% increase in memory bandwidth utilization
  • Latest LPDDR4/3 memory support up to LPDDR4-4267
  • Low power design and operating modes

End to end quality of service

  • 25% reduction in average CPU latency
  • Complete solution with CoreLink interconnect

Integrated solution

  • TrustZone™ security and media protection for DRM content
  • Supports industry standard DFI 4.0 PHY interface
  • Integrated memory scheduling and memory controller enable the highest utilization


Figure: CoreLink DMC-500 improvements

Increasing memory bandwidth and reducing latency will bring features like immersive mobile gaming, 4K content and screen display, and 120fps video playback into the reckoning for next-generation devices.

CoreLink DMC-500 extends the performance and low-power leadership of ARM systems to the advanced LPDDR4/3 memories:

  • Single DFI 4.0 memory interface supporting x16 LPDDR4 up to DDR-4267 and x32 LPDDR3 up to DDR-2133, with dual-DMC channel support for x32 LPDDR4
  • Support for clock gating, dynamic frequency change and memory low-power modes for optimized power consumption
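The interface widths and transfer rates listed above translate into theoretical peak bandwidth with simple arithmetic: transfers per second multiplied by bytes per transfer. The sketch below works through it. The six-channel case is an assumption made for illustration (one x16 LPDDR4-4267 channel on each of the six memory interfaces the CCI-550 can be configured with), and sustained bandwidth will be lower than these peaks because utilization is never 100%.

```python
def peak_bandwidth_gbps(mega_transfers_per_sec, bus_width_bits, channels=1):
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_sec * 1e6 * bytes_per_transfer * channels / 1e9

lpddr4_x16 = peak_bandwidth_gbps(4267, 16)            # one x16 LPDDR4 channel
lpddr3_x32 = peak_bandwidth_gbps(2133, 32)            # one x32 LPDDR3 channel
six_channels = peak_bandwidth_gbps(4267, 16, channels=6)  # assumed 6-channel system

print(f"x16 LPDDR4-4267:     {lpddr4_x16:.2f} GB/s")
print(f"x32 LPDDR3-2133:     {lpddr3_x32:.2f} GB/s")
print(f"6 x x16 LPDDR4-4267: {six_channels:.2f} GB/s")
```

Under that assumption the six-channel total comes to roughly 51 GB/s, which is consistent with the "exceeding 50GB/s" figure quoted for the CCI-550.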

System-Optimized IP Enables Seamless Computing





QoS is a way of prioritizing traffic dynamically across system masters. Masters can be categorized into three broad classes:

  1. Latency-sensitive masters - benefit most from lowest latency response from memory, for example CPU
  2. Greedy masters (or bulk transfer) - capable of submitting many requests to memory but no firm deadline
  3. Real-time masters - firm deadline by which response must be received, for example display controller
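The three classes can be made concrete with a toy arbiter. This is a minimal sketch of one possible policy, not the QoS scheme used in CoreLink: the ranking, the `urgent_slack` threshold and the master names are all hypothetical. It shows the dynamic part of the idea, where a real-time master is served behind the CPU while it has slack but jumps to the front as its deadline approaches.

```python
from dataclasses import dataclass
from typing import Optional

LATENCY_SENSITIVE = "latency_sensitive"
GREEDY = "greedy"
REAL_TIME = "real_time"

@dataclass
class Request:
    master: str
    qos_class: str
    arrival: int                    # cycle at which the request was issued
    deadline: Optional[int] = None  # only set for real-time masters

def priority(req, now, urgent_slack=4):
    """Lower tuples are served first. `urgent_slack` is an arbitrary illustrative value."""
    if req.qos_class == REAL_TIME and req.deadline - now <= urgent_slack:
        rank = 0   # real-time master about to miss its deadline
    elif req.qos_class == LATENCY_SENSITIVE:
        rank = 1   # e.g. CPU: wants the lowest latency
    elif req.qos_class == REAL_TIME:
        rank = 2   # real-time master that still has slack
    else:
        rank = 3   # greedy / bulk transfer: no firm deadline
    return (rank, req.arrival)      # ties broken by age

def arbitrate(pending, now):
    """Pick the next request to send to memory."""
    return min(pending, key=lambda r: priority(r, now))

pending = [
    Request("bulk_dma", GREEDY, arrival=0),
    Request("display", REAL_TIME, arrival=1, deadline=20),
    Request("cpu", LATENCY_SENSITIVE, arrival=2),
]

early = arbitrate(pending, now=3).master   # display still has plenty of slack
late = arbitrate(pending, now=17).master   # display is now close to its deadline
print(early, late)
```

In this sketch the CPU wins early on, and the display controller wins once its remaining slack drops to the threshold.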

CoreLink CCI-550 and CoreLink DMC-500 have been designed with a system-wide QoS that has been validated to work with CoreLink NIC-450, Cortex-A53 and Cortex-A72 processors and Mali GPUs.

System-wide QoS provides the flexibility to prioritize tasks in a way that optimizes the performance of the Cortex processors and Mali GPU. In benchmark tests, QoS enhancements have shown up to a 25% reduction in CPU latency across the chip, which translates directly into faster performance.


The TrustZone Secure Media Path provides end-to-end protection for Ultra-HD content from Mali to memory. Together with the CoreLink DMC-500, which contains an integrated TrustZone controller, it keeps latency to a minimum, making watching Netflix in 4K quality on your mobile device a distinct possibility. And who could say no to that?

We have come to expect more from our mobile devices as they have become our primary computing devices. The challenge for the semiconductor industry is to keep up with that demand, and there is a growing realization that system performance is what really counts.

CoreLink System IP will be fundamental in the performance and functionality increases of next-generation SoCs for mobile devices, and I look forward to seeing the next jump in mobile user experience enabled by these.

Further information:

Press Release

CoreLink CCI-550

CoreLink DMC-500

Extended System Coherency - Part 1 - Cache Coherency Fundamentals

GPU Coherency uni paper
