The memory sub-system is one of the most complex systems in a SoC, critical for overall performance of the chip. Recent years have witnessed explosive growth in the memory market with high-speed parts (DDR4/3 with/without DIMM support, LPDDR4/3) gaining momentum in mobile, consumer and enterprise systems. This has not only resulted in increasingly complex memory controllers (MC) but also PHYs that connect the memory sub-system to the external DRAM. Due to the high-speed transfer of data between the SoC and DRAM, it is necessary to perform complex training of memory interface signals for best operation.
Traditionally, MC and PHY integration was considered a significant challenge, especially when the two IP blocks came from different vendors. The key reasons were the rapid evolution of memory protocols and a DFI boundary between controller and PHY that was incompletely specified, or in some cases ambiguous, with respect to the requirements for MC-PHY training.
I’ll try to shed some light on this topic. Recently, with the release of the DFI 4.0 draft specification for the MC-PHY interface, things certainly seem to be heading in the right direction. For folks unfamiliar with DFI (DDR PHY Interface), it is an industry standard that defines the boundary signals and protocol between any generic MC and PHY. Since the inception of DFI 1.0 back in 2006, the specification has steadily advanced to cover all aspects of MC-PHY operation, encompassing all relevant DRAM technology requirements. The DFI 4.0 specification is more mature than previous releases and focuses specifically on backwards compatibility and MC-PHY interoperability.
But that’s not the only reason why MC-PHY integration has gotten easier. To understand this better, we need to examine how the MC and PHY interact during training. There are two fundamental ways that training of memory signals can happen:
Interestingly, PHY IP providers have decided to take ownership of training by implementing support for PHY independent mode in their IP, thereby retaining control over how the training algorithms are optimized for their own PHY architecture. With PHY complexity growing and timing closure becoming harder at high DDR speeds, support for PHY independent mode training is a valuable differentiator for PHY IP providers.
With the PHY doing most of the heavy lifting during the training, the MC only needs to focus on two questions:
The MC thus treats the PHY’s request for independent-mode training as an interrupt, something it needs to schedule alongside the multitude of other things it does for best memory operation. Training thus becomes a Quality-of-Service (QoS) exercise for the controller, with a different set of parameters to optimize. The upside is that QoS is essentially what a good MC does very well.
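To make the "training as an interrupt" idea a little more concrete, here is a minimal Python sketch of one possible arbitration policy. It is only an illustration under my own assumptions, not the behaviour of any particular controller or anything mandated by DFI: the names `TrainingRequest`, `Controller`, `max_wait_cycles` and the two-queue model are hypothetical, and the grant is modelled as a single event rather than a full training sequence. The policy defers the PHY's request while latency-critical traffic is pending, but never beyond a bounded wait.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TrainingRequest:
    """Hypothetical model of a PHY-initiated training request."""
    arrival_cycle: int    # cycle at which the PHY raises the request
    max_wait_cycles: int  # assumed bound on how long the MC may defer it


@dataclass
class Controller:
    """Toy memory controller with two traffic classes (an assumption)."""
    urgent_queue: List[int] = field(default_factory=list)  # latency-critical IDs
    bulk_queue: List[int] = field(default_factory=list)    # best-effort IDs

    def step(self, cycle: int, pending: Optional[TrainingRequest]) -> str:
        """Pick one action for this cycle: grant, urgent, bulk or idle."""
        if pending is not None:
            waited = cycle - pending.arrival_cycle
            # Grant when the deadline is reached, or as soon as no
            # latency-critical traffic would be delayed by doing so.
            if waited >= pending.max_wait_cycles or not self.urgent_queue:
                return "grant"
        if self.urgent_queue:
            self.urgent_queue.pop(0)
            return "urgent"
        if self.bulk_queue:
            self.bulk_queue.pop(0)
            return "bulk"
        return "idle"


def run(urgent: List[int], bulk: List[int],
        request: Optional[TrainingRequest], cycles: int) -> List[str]:
    """Simulate `cycles` cycles and return the action taken in each one."""
    mc = Controller(list(urgent), list(bulk))
    pending: Optional[TrainingRequest] = None
    log: List[str] = []
    for cycle in range(cycles):
        if request is not None and cycle == request.arrival_cycle:
            pending = request
        action = mc.step(cycle, pending)
        if action == "grant":
            pending = None  # training itself is outside this sketch
        log.append(action)
    return log


if __name__ == "__main__":
    # Request arrives at cycle 1 and is deferred until urgent traffic drains.
    print(run([1, 2, 3], [10, 11], TrainingRequest(1, 5), 8))
    # ['urgent', 'urgent', 'urgent', 'grant', 'bulk', 'bulk', 'idle', 'idle']
```

The two knobs here, which traffic may delay a grant and for how long, are the kind of parameters the controller ends up tuning; a real design would have many more.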
With this clarity at the DFI boundary, the burden of silicon proof falls mainly on the PHY, because it is the PHY that has to train correctly at high speeds and deliver a good data eye. The risk of critical MC bugs that can only be found in silicon is low, and a strong architecture and design/verification methodology can help eliminate it. So the demands on the MC have shifted away from MC-PHY interoperability and towards performance (memory bandwidth and latency).
I am leaving that as the topic of my next blog.
ARM is building state-of-the-art memory controllers with an emphasis on CPU-to-memory performance, and supporting the DFI-based PHY solutions available in the market today. We have set up partnerships with third-party PHY providers to ensure that integration at the DFI boundary is seamless for the end customer. ARM’s controllers support all the different training modes used by different PHYs, thereby giving customers flexibility in choosing the best overall solution for their memory sub-system deployment.
Thanks for reading my blog. I welcome your feedback.