The most recently released Arm Client CPU, Arm Cortex-A76, provides a new premium mobile experience, boasting laptop-class performance with mobile efficiency. This efficiency has already been proven by earlier Arm CPUs in the Windows laptop ecosystem, where Arm-based SoCs from Qualcomm have now been integrated into the initial Always-Connected PC offerings. As noted in Rene’s blog, these Arm-based laptops are already paying off with an unprecedented 20-plus hours of battery life and steadily improving performance thanks to ongoing OS and application optimizations.
A few weeks ago, Nandan shared the Client CPU roadmap that follows Cortex-A76 through 2020, with Deimos and Hercules further increasing the performance and efficiency of client CPUs, all optimized for laptop-class performance and the 7nm and 5nm process nodes.
Herein lies the challenge: how do you translate the year-over-year compute performance improvements of more than 15% through 2020 into silicon? How does your implementation keep up with the cadence of new cores and improvements in both performance and efficiency? The new microarchitecture of Cortex-A76 and the Arm DynamIQ technology are new to many design teams, and the advanced process nodes present new implementation challenges. How do you ensure that the increased productivity of the new Client CPUs is echoed in the productivity of your implementation teams?
Arm POP IP is the answer. Arm POP IP is already enabling laptop-class performance in 7nm SoCs based on Cortex-A76 by giving our partners the opportunity to push the clock speed past 3.0GHz and up to 3.3GHz within power envelopes around half that of today’s mass-market x86 processors. POP IP adds value in four key ways, helping you keep up with the cadence and bring the latest Arm processor cores to market.
POP IP solves implementation challenges at advanced geometries before you experience them. The 16/14nm FinFET processes brought new challenges, including very strict design rules that affect placement and routing. 7nm brings additional placement rules, a new style of via connection (via ladders), and the idea of explicit coloring, which requires updates to the physical IP and the design tools. EUV processes will have their own challenges that impact the physical IP and the tools, and POP IP continues to provide solutions.
One great feature of POP IP happens completely behind the scenes. Internal to Arm, we are able to begin early processor trials with the physical IP and the POP IP methodology concurrently for new cores. This begins an iterative flow that also benefits from the collaboration between Arm, EDA vendors, and the foundry process teams. Our joint efforts with this portion of the ecosystem result in refinement of the core, the Artisan physical IP, the SoC implementation methodology, and the process itself.
The result of this development and implementation learning internal to Arm is delivered to you as part of the POP IP contents: in the user guide and in the reference scripts supporting the final implementation.
Some of the newer implementation challenges center around the very flexible DynamIQ technology, which has redefined multi-core computing. The DynamIQ Shared Unit (DSU) offers synchronous and asynchronous interfacing. We are seeing the asynchronous configuration commonly used, which presents constraint and timing challenges. Crosstalk must be controlled in the long channels to and from the CPU slaves. Both the cache interface and the architectural clock gating also present new challenges, which POP IP solves for you.
As we move to ever smaller nodes and work in processes with ever lower voltages, variation presents new challenges. We address these as part of the reference scripts and with the deliverables offered as part of the Artisan physical IP, which include both 2-dimensional AOCV derate files and the more accurate LVF deliverables. The scripts and models enable detailed variation analysis, helping take the worry out of the impact of variation. Arm continues to push for increased accuracy through an adopted proposal to include higher-order moments in LVF modeling, which are required for LVF to reflect the skewed distribution of variation seen at low voltages.
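To illustrate why higher-order moments matter for a skewed distribution, here is a minimal sketch in Python. It is not the LVF format or any Arm tool flow: the log-normal delay model, its parameters, and the function names are assumptions chosen purely for illustration. It compares a tail estimate that uses only the mean and sigma against the tail actually observed in the samples, and reports the skewness (the third standardized moment) that a mean-and-sigma model ignores.

```python
import random
import statistics


def moments(samples):
    """Return mean, standard deviation, and skewness of the samples."""
    mean = statistics.fmean(samples)
    sigma = statistics.pstdev(samples, mu=mean)
    # Skewness: the third standardized moment.
    skew = sum(((x - mean) / sigma) ** 3 for x in samples) / len(samples)
    return mean, sigma, skew


def empirical_quantile(samples, q):
    """Return the sample value at quantile q (simple order statistic)."""
    ordered = sorted(samples)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


def compare_tail_estimates(n=100_000, seed=0):
    """Compare a mean + 3*sigma estimate with the observed tail.

    ASSUMPTION: cell delay is modeled as log-normal purely to obtain a
    right-skewed distribution; this is not measured silicon data.
    """
    rng = random.Random(seed)
    delays = [rng.lognormvariate(0.0, 0.5) for _ in range(n)]
    mean, sigma, skew = moments(delays)
    return {
        "mean": mean,
        "sigma": sigma,
        "skew": skew,
        # What a symmetric, two-moment model predicts for the +3 sigma point.
        "gaussian_3sigma": mean + 3.0 * sigma,
        # The quantile that +3 sigma corresponds to for a true Gaussian.
        "observed_tail": empirical_quantile(delays, 0.99865),
    }


if __name__ == "__main__":
    for name, value in compare_tail_estimates().items():
        print(f"{name}: {value:.3f}")
```

For a right-skewed distribution like the one assumed here, the observed tail sits well above the mean-plus-three-sigma estimate, which is the kind of gap that carrying skewness (and other higher-order moments) in the variation model is meant to close.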
Development of POP IP starts by selecting a core configuration, a process node, and a market-based power, performance and area (PPA) goal. Refinement of the implementation flow and Artisan physical IP uses the internal knowledge of the core, process, and implementation teams to work through our iterative methodology. The result is optimized Artisan physical IP delivered with an optimized implementation flow and the margining details used to achieve the quoted PPA. End-users don’t have to guess at the flow, which libraries to introduce when, or which tool options will produce the PPA.
POP IP is flexible: users can run the reference flow without changes, or select only the Artisan physical IP, only the implementation recipe, or pieces of each, as desired. In addition, the Artisan physical IP can be used for core implementation or for any block on the SoC. A more recent addition to POP IP implementation is the Artisan Power Grid Architect (PGA). PGA is used in the development of POP IP and is provided to end-users for fast floorplan and power network exploration, should users choose to change the floorplan included in the POP IP.
Additional flexibility is provided with POP IP availability across foundries for multiple Arm cores with support for the latest EDA tool flows.
POP IP addresses the implementation challenges of new cores, new processes and new implementation methodologies. Use POP IP to translate the substantial gains in CPU compute onto silicon. Extend the productivity of the new Arm Client CPUs to your implementation team, and make efficient use of compute cycles, the new Arm cores and your design teams. Keep up with the cadence by adopting Arm POP IP for the most advanced Arm Client CPUs.
POP IP Resources