Using the power management features of a complex SoC (system on chip) can be difficult because of the design's complexity and its multiple power domains. In this blog I discuss the power management hardware features of the ARM® CoreTile Express A15x2-A7x3 (V2P-CA15_A7) development board (CoreTile) with a Versatile™ Express V2M-P1 motherboard, including Dynamic Voltage and Frequency Scaling (DVFS) and power switching. The aim is to let you explore the benefits of ARM's big.LITTLE™ technology and gain a better understanding of how it works.
The CoreTile provides isolated power domains for the Cortex®-A15 cluster, Cortex-A7 cluster and the SoC. These are supplied by PSU0, PSU1 and PSU2 respectively on the board.
The PSU0, PSU1 and PSU2 power supplies have on-board current sensing resistors with external test-points to allow current measurement and power profiling, as shown in Figure 2 Power Management system architecture. These supplies also have on-board ADCs for voltage, current and power measurement, which is performed through the CoreTile using the on-board microcontroller (DCC). On-board energy meters are also provided for PSU0 and PSU1.
The PSU0 and PSU1 power supplies can be switched off through the DCC, which interfaces directly with the Serial Power Controller (SPC) implemented within the CoreTile test-chip. The DCC is responsible for controlling DVFS, cluster power-up/down and cluster wake-up.
The on-board clocking scheme is also shown, with the option of clocking the Cortex-A15 cluster from OSCCLK0/1/7 and the Cortex-A7 cluster from OSCCLK2/3/7. The SoC domain is driven by OSCCLK7. The default OSCCLK values are set by the board's board.txt file stored on the USBMSD. They can also be changed dynamically through the DCC using the performance tables in the board.txt file. The recommended way to interface to the DCC is through the test-chip SPC registers.
Power Management
The CoreTile power management provides the following three primary functions through the SPC (serial power controller) and SCC (serial configuration controller) interfaces to allow software implementation of the big.LITTLE Switcher / MP Model:
big.LITTLE system
The CoreTile test-chip supports all three software models that can be used to manage a big.LITTLE system: Global Task Scheduling, Cluster Migration and CPU Migration.
In big.LITTLE Cluster Migration only one of the CPU clusters is active at a time. Because the Cortex-A7 is more energy efficient than the Cortex-A15, high-performance applications are executed on the Cortex-A15 cluster and medium- and low-performance applications on the Cortex-A7 cluster. Once the Cortex-A7 cluster has reached its highest performance operating point, or the Cortex-A15 cluster has reached its lowest, execution can be migrated to the other cluster. The design relies on CoreLink™ CCI-400 for fast migration. Data migration works by letting the inbound side access the outbound side's cache via snooping, which avoids expensive accesses to main memory.
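The switch points described above can be pictured as a small state machine. The sketch below is only an illustration under stated assumptions: the number of operating points per cluster and all names (`ClusterState`, `next_state`) are hypothetical and are not taken from the CoreTile software.

```python
from dataclasses import dataclass

# Hypothetical operating-point counts; real tables are platform specific.
LITTLE_LEVELS = 4  # Cortex-A7 operating points 0..3
BIG_LEVELS = 4     # Cortex-A15 operating points 0..3


@dataclass
class ClusterState:
    active: str  # "LITTLE" or "big": only one cluster is active at a time
    level: int   # index of the current operating point on that cluster


def next_state(state: ClusterState, want_more: bool) -> ClusterState:
    """Move one operating point up or down, migrating at the boundary."""
    if want_more:
        # LITTLE already at its highest point: migrate to the big cluster.
        if state.active == "LITTLE" and state.level == LITTLE_LEVELS - 1:
            return ClusterState("big", 0)
        top = (LITTLE_LEVELS if state.active == "LITTLE" else BIG_LEVELS) - 1
        return ClusterState(state.active, min(state.level + 1, top))
    # big already at its lowest point: migrate back to the LITTLE cluster.
    if state.active == "big" and state.level == 0:
        return ClusterState("LITTLE", LITTLE_LEVELS - 1)
    return ClusterState(state.active, max(state.level - 1, 0))
```

A real switcher would also add hysteresis so that a load hovering around the boundary does not cause repeated migrations.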
CPU Migration is analogous to Cluster Migration, except that context is migrated for individual cores rather than for the whole cluster. Each LITTLE core is paired with a big core to form a virtual core. The scheduler operates across virtual cores, and the load is evaluated for each virtual core individually. If the load requirement is high, the virtual core is instantiated by the big core; if it is low, by the LITTLE core. As the load profile changes, the whole context of that core is migrated between the big and LITTLE core as appropriate. This migration takes advantage of the CoreLink CCI-400, which provides coherency that is used not only for migration but also for data sharing across the clusters. With CPU Migration both clusters can be active at any one time.
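As a rough illustration of the virtual core idea, the sketch below pairs one LITTLE core with one big core and picks which of them runs based on load. The two thresholds and every name in it are assumptions made for this example; the real switcher's policy and values are not described here.

```python
from dataclasses import dataclass

# Hypothetical load thresholds (fraction of capacity in use).
UP_THRESHOLD = 0.8    # at or above this, run on the big core
DOWN_THRESHOLD = 0.3  # at or below this, run on the LITTLE core


@dataclass
class VirtualCore:
    little_id: int
    big_id: int
    on_big: bool = False  # start on the LITTLE core

    def update(self, load: float) -> bool:
        """Re-evaluate the load; return True if the context must migrate."""
        if not self.on_big and load >= UP_THRESHOLD:
            self.on_big = True
            return True
        if self.on_big and load <= DOWN_THRESHOLD:
            self.on_big = False
            return True
        return False  # load is in between: stay where we are

    @property
    def physical_core(self) -> str:
        return f"big{self.big_id}" if self.on_big else f"LITTLE{self.little_id}"
```

Because each virtual core is evaluated on its own, one pair can be running on its big core while another is on its LITTLE core, which is why both clusters may be powered at the same time.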
For big.LITTLE Global Task Scheduling, the operating system scheduler operates across all the cores in the system. The scheduler is aware of the difference in compute capacity between big and LITTLE cores and places threads on big or LITTLE cores depending on each thread's performance requirement, which is determined from the thread's historical load profile. One implementation of Global Task Scheduling is ARM's big.LITTLE MP Linux kernel. This kernel schedules tasks to the most appropriate processors and powers down inactive clusters, or inactive cores on platforms that support core power gating. For example, when running only low-intensity tasks, the kernel can choose to power down the big cluster and continue on the LITTLE cluster alone. As with CPU Migration, both clusters can be active at any one time. CoreLink CCI-400 is the key ingredient that makes this possible by providing coherency between the clusters.
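To make the idea of placing threads by historical load more concrete, here is a toy model. It is not the big.LITTLE MP scheduler: the decaying average, the threshold and all names are assumptions chosen purely for illustration.

```python
class Task:
    def __init__(self, name: str, decay: float = 0.5):
        self.name = name
        self.decay = decay  # weight given to history (hypothetical value)
        self.load = 0.0     # tracked load, between 0 and 1

    def record(self, runnable_fraction: float) -> None:
        """Fold the latest sample into the task's historical load."""
        self.load = self.decay * self.load + (1 - self.decay) * runnable_fraction


def place(tasks, big_threshold: float = 0.6):
    """Return (names for big cores, names for LITTLE cores)."""
    big = [t.name for t in tasks if t.load >= big_threshold]
    little = [t.name for t in tasks if t.load < big_threshold]
    return big, little


def big_cluster_can_power_down(tasks, big_threshold: float = 0.6) -> bool:
    """True when no task currently needs a big core."""
    big, _ = place(tasks, big_threshold)
    return not big
```

With only lightly loaded tasks in the system, `big_cluster_can_power_down` returns `True`, which corresponds to the case where the kernel continues on the LITTLE cluster alone.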
Migration
In the big.LITTLE Cluster and CPU Migration models, the software handles all the mechanisms required to switch between clusters, such as processor state save and restore, snoop control and migration of interrupts. The migration sequence does, however, have some system dependencies.
For example, the inbound cluster has to be powered-up, brought out of reset and isolation has to be removed. Once the inbound cluster is up and running, then the outbound cluster has to be put in reset, isolated, and then powered-down. The software uses the test-chip SPC interface to communicate with the external CoreTile DCC which handles all the switching mechanics.
DVFS interface
Dynamic Voltage and Frequency Scaling allows the operating system to pick the optimal voltage and frequency for a particular load requirement, reducing the dynamic power consumption of the system.
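The reason scaling voltage together with frequency pays off can be seen from the usual first-order approximation for CMOS dynamic power, P ≈ C·V²·f. The sketch below compares two operating points using that approximation; the voltages and frequencies in the usage example are made-up values, not CoreTile operating points.

```python
def relative_dynamic_power(v: float, f: float, v_ref: float, f_ref: float) -> float:
    """Dynamic power at (v, f) relative to (v_ref, f_ref), using P ~ C * V^2 * f.

    The switched capacitance C cancels out because both points are on
    the same hardware.
    """
    if v_ref <= 0 or f_ref <= 0:
        raise ValueError("reference voltage and frequency must be positive")
    return (v / v_ref) ** 2 * (f / f_ref)


# Hypothetical example: drop from 1.0 V / 1000 MHz to 0.9 V / 600 MHz.
ratio = relative_dynamic_power(0.9, 600.0, 1.0, 1000.0)
print(f"dynamic power falls to about {ratio:.0%} of the reference")
```

Frequency alone would only give a linear saving; the quadratic voltage term is what makes the combined scaling worthwhile.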
It is the responsibility of the operating system to determine the required performance level based on the expected load on a cluster. This desired performance level is a value between 0 and 7, which is passed to the DCC through the SPC registers. Writing to these registers raises an interrupt to the DCC, which then reads the SPC registers and translates the performance level into a voltage and frequency value.
Power-down and Power-up modes
The CoreTile has three power islands relevant to power management: the Cortex-A15 cluster, the Cortex-A7 cluster and the SoC. The Cortex-A15 and Cortex-A7 clusters support several levels of power management. A block diagram of the test-chip and power supplies is shown below.
Power measurement
As mentioned in the previous section, the CoreTile has on-board current sensing resistors:
Two-pin jumper headers allow an external meter to be used to determine the current consumption on each supply. J16 is not fitted by default because the internal AXI includes many interfaces, so a full understanding of the measured current consumption is system dependent.
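If the external meter is used to read the voltage drop across a sense resistor, the supply current follows from Ohm's law. The sketch below assumes that measurement method; the resistor value and readings in the usage example are hypothetical, so check the board documentation for the real values.

```python
def supply_current(v_drop_volts: float, r_sense_ohms: float) -> float:
    """Current through the sense resistor, I = V / R."""
    if r_sense_ohms <= 0:
        raise ValueError("sense resistance must be positive")
    return v_drop_volts / r_sense_ohms


def supply_power(v_supply_volts: float, v_drop_volts: float, r_sense_ohms: float) -> float:
    """Power delivered by the supply, P = V_supply * I."""
    return v_supply_volts * supply_current(v_drop_volts, r_sense_ohms)


# Hypothetical readings: 5 mV across a 10 milliohm resistor on a 1.0 V rail.
print(supply_current(0.005, 0.010), "A")
print(supply_power(1.0, 0.005, 0.010), "W")
```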
On board power measurement
The CoreTile includes on-board current, voltage and power measurement of PSU0/1. These values are accessed through the Versatile Express motherboard SYS_CFGCTRL or CoreTile SPC_SYS_CFG interfaces. For more information, refer to the CoreTile TRM.
On board energy meter
The CoreTile includes two on-board energy meters for reading PSU0/1 energy consumption. These sample at 10 kHz and are also available through the Versatile Express motherboard interface. For more details, please refer to the "Energy Meter" section in the CoreTile TRM and read my colleague's blog about using the ARM Development Studio 5 (DS-5™) and Streamline™ performance analyzer.
The Application note AN318 CoreTile Express A15x2 A7x3 Power Management goes into the sequencing of the different power states and the control of the various interfaces in greater detail.
There is also a video of a demo showing the operation of big.LITTLE here.
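To relate energy readings to power, the sketch below shows the arithmetic only. It assumes readings already converted to joules and seconds; the actual register format and units are described in the TRM, and the function names here are hypothetical.

```python
def energy_from_samples(power_samples_w, sample_rate_hz: float = 10_000.0) -> float:
    """Integrate power samples taken at a fixed rate into energy in joules."""
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    return sum(power_samples_w) / sample_rate_hz


def average_power(e_start_j: float, e_end_j: float, t_start_s: float, t_end_s: float) -> float:
    """Average power between two cumulative energy readings."""
    dt = t_end_s - t_start_s
    if dt <= 0:
        raise ValueError("end time must be after start time")
    return (e_end_j - e_start_j) / dt
```

Taking the difference between two cumulative readings is convenient for profiling a workload: read the meter before and after the run and divide by the elapsed time.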