In the first and second parts of this blog series, I have explored the challenges in designing reliable power-delivery networks, the roles of various elements of these networks and where Arm Research has focused its efforts to maximise the benefits for such networks. In this third and final part, I’ll discuss this work in more detail.
At Arm Research, we have designed an on-chip digital storage oscilloscope that directly samples the power rails. The introspection it provides lets a system designer investigate runtime bugs caused by power-delivery weaknesses. The oscilloscope readings may also be used to drive learning algorithms that automatically generate voltage-noise vectors, which can in turn be used to guardband production systems.
Figure 1. On-Chip Digital Storage Oscilloscope
Figure 1 shows the architecture of the on-chip oscilloscope. It consists of a VCO-based voltage sampler circuit that samples a ring-oscillator (RO) frequency to compute the supply voltage. Additional triggering logic compares a voltage ‘undershoot’ or ‘overshoot’ against preset thresholds and stores the resulting voltage waveform in an on-chip SRAM buffer. Event counters and tidemarks provide additional statistics on power-supply noise conditions. The oscilloscope also includes load circuitry that stresses the on-chip PDN with a known stimulus. By measuring the response to that stimulus, it is possible to characterize the PDN and measure parameters such as its first-order resonance frequency.
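To make the trigger-and-capture behaviour concrete, here is a minimal software sketch of the idea. It is not the hardware design: the class name `ScopeModel`, the linear count-to-voltage calibration, the buffer depths and the rule of counting one event per threshold crossing are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


def count_to_voltage(count, cal_lo=(400, 0.80), cal_hi=(600, 1.00)):
    """Map a ring-oscillator edge count to volts by linear interpolation.

    The two (count, volts) calibration points are hypothetical; a real RO
    is only approximately linear in supply voltage.
    """
    (c0, v0), (c1, v1) = cal_lo, cal_hi
    return v0 + (count - c0) * (v1 - v0) / (c1 - c0)


@dataclass
class ScopeModel:
    undershoot_v: float            # trigger when a sample falls below this
    overshoot_v: float             # trigger when a sample rises above this
    pre_samples: int = 2           # history kept before the trigger point
    post_samples: int = 2          # samples stored after the trigger point
    undershoot_events: int = 0     # event counters (one per crossing)
    overshoot_events: int = 0
    min_v: float = float("inf")    # tidemarks: extremes seen so far
    max_v: float = float("-inf")
    captures: list = field(default_factory=list)  # stands in for the SRAM buffer

    def __post_init__(self):
        self._history = deque(maxlen=self.pre_samples)
        self._pending = None       # capture currently being filled, if any
        self._remaining = 0
        self._was_under = False
        self._was_over = False

    def sample(self, ro_count):
        v = count_to_voltage(ro_count)
        self.min_v = min(self.min_v, v)
        self.max_v = max(self.max_v, v)

        under, over = v < self.undershoot_v, v > self.overshoot_v
        new_under = under and not self._was_under
        new_over = over and not self._was_over
        if new_under:
            self.undershoot_events += 1
        if new_over:
            self.overshoot_events += 1

        if self._pending is not None:
            # Keep filling the post-trigger part of the capture in flight.
            self._pending.append(v)
            self._remaining -= 1
        elif new_under or new_over:
            # Start a capture: pre-trigger history plus the trigger sample.
            self._pending = list(self._history) + [v]
            self._remaining = self.post_samples

        if self._pending is not None and self._remaining <= 0:
            self.captures.append(self._pending)
            self._pending = None

        self._history.append(v)
        self._was_under, self._was_over = under, over


scope = ScopeModel(undershoot_v=0.85, overshoot_v=0.95)
for ro_count in [500, 505, 495, 500, 440, 460, 500, 500, 500, 560, 500, 500]:
    scope.sample(ro_count)
```

After this run, `scope.captures` holds one waveform around the undershoot and one around the overshoot, while `scope.min_v` and `scope.max_v` play the role of the tidemarks.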
Visibility of supply-noise conditions during dynamic operation goes a long way towards addressing power-delivery concerns. Unfortunately, such specialized circuitry is not yet a standard feature in most high-end Arm systems. In our recent collaborative research with the University of Cyprus, we have developed a more general approach to monitoring voltage noise in processors. The technique senses modulations in the electromagnetic (EM) radiation emanated by a processor, using an external antenna and a spectrum analyzer. Every CPU acts as a radiating antenna because its current consumption varies over time with program activity. Hence, this approach is potentially extensible to any platform that does not currently support direct measurement of voltage noise.
In Figure 2 below, we capture a trace of the electromagnetic emanation from the Arm Juno-R2 platform on a spectrum analyzer. Sure enough, the frequency content of the EM radiation matches the dominant frequency component of the voltage noise captured by an on-chip sensor.
Figure 2. The left plot shows a trace of the voltage noise captured using an on-chip sensor. The dominant frequency component of the trace (67MHz) matches exactly with the EM radiation captured using an external spectrum analyser.
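The comparison in Figure 2 comes down to finding the dominant spectral component of a sampled voltage trace. The sketch below shows that step on a synthetic signal. It is not the tooling used in the research: the sample rate, amplitudes and the function name `dominant_frequency` are assumptions, and a naive DFT is used only to keep the example free of dependencies (an FFT library would be the practical choice).

```python
import cmath
import math


def dominant_frequency(samples, sample_rate_hz):
    """Return the frequency (Hz) of the largest non-DC bin of a naive DFT."""
    n = len(samples)
    mean = sum(samples) / n                      # remove the DC supply level
    centred = [s - mean for s in samples]
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):                   # positive frequencies only
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(centred))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * sample_rate_hz / n


# Synthetic stand-in for a sensor trace: a 1.0 V rail with a 67 MHz ripple
# plus a weaker 20 MHz component. All rates and amplitudes are made up.
FS = 500e6          # sample rate in Hz, giving 1 MHz bins for N = 500
N = 500
trace = [1.0
         + 0.030 * math.sin(2 * math.pi * 67e6 * i / FS)
         + 0.010 * math.sin(2 * math.pi * 20e6 * i / FS)
         for i in range(N)]

peak_hz = dominant_frequency(trace, FS)
```

On this synthetic trace the strongest bin falls at the 67 MHz ripple; the same peak-picking step applied to a spectrum-analyser sweep would give the value to compare against.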
Our research shows how capturing these modulations in the radiated energy can reveal key dynamic properties of the system, such as its natural resonance frequency. The system resonance frequency is a strong function of the power-gating state of the computing cluster. For instance, in a multicore configuration, power gating a counterpart core significantly reduces the available on-chip capacitance on the power network. This causes a drastic shift in the system resonance frequency, a strong signature that can be captured in the EM frequency spectrum.
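A rough way to see why the resonance moves is the standard lumped LC approximation, f = 1 / (2π√(LC)), where L is the effective package inductance and C the on-die capacitance. The sketch below applies it under two simplifying assumptions of mine: each powered core contributes an equal share of capacitance, and a gated core contributes none. The component values are hypothetical and chosen only to illustrate the trend.

```python
import math


def resonance_hz(inductance_h, capacitance_f):
    """First-order resonance of a lumped LC model: f = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


# Hypothetical values, not measurements from any Arm platform.
PACKAGE_L = 50e-12        # effective package/board inductance, henries
CAP_PER_CORE = 50e-9      # on-die capacitance per powered core, farads

both_on = resonance_hz(PACKAGE_L, 2 * CAP_PER_CORE)
one_gated = resonance_hz(PACKAGE_L, 1 * CAP_PER_CORE)

# Halving C raises f by sqrt(2) in this model, whatever values are chosen.
shift_ratio = one_gated / both_on
```

In this simplified model, gating one of two cores raises the resonance frequency by a factor of √2, which is the kind of shift that would show up as a moved peak in the EM spectrum.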
This technique is non-intrusive, since it does not physically interact with the CPU being monitored. Consequently, it opens up an entirely new approach in computer architecture for characterizing and benchmarking high-end systems. The work was recently published in IEEE Computer Architecture Letters (CAL, December issue) and has been recognised as one of the top three papers to appear in CAL in 2017. We were invited to present this work at the High Performance Computer Architecture Conference (HPCA) 2018, where we received the award.
An extended version of the work will be published at MICRO 2018 in Fukuoka, Japan, and will be presented by Zacharias Hadjilambrou. Zacharias is a PhD student at the University of Cyprus and has interned at Arm Research at various points across a 5-year collaboration since 2013.
The EM research was funded under Project UniServer, an EU H2020-funded project. Project UniServer investigates the impact of supply noise in enterprise-class Arm systems and seeks to develop low-overhead mitigation approaches. If you’d like any further information on the project, please reach out to me directly.