Partnerships and the Myths of System Resilience

At the 55th Design Automation Conference 2018

The Design Automation Conference (DAC) brings together an interesting mixture of EDA and IP companies enabling tool users, licensees and partners to meet. Automotive and functional safety continued to be major themes at DAC. They are important topics for our industry as it adapts to address one of the fastest growing markets for System-on-Chips (SoCs).

Three key themes from DAC

1. Silicon Intellectual Property

The IP track presentations and poster sessions were well attended. The presence of many IP vendors underlined this, as did, of course, the endurance of the Denali party, now organised by Cadence and co-sponsored by partners including Arm.

2. The Importance of Partnerships and Ecosystems

Partnerships and ecosystems are an important theme at DAC; that is the nature of bringing the industry together. Arm is not only a large consumer of EDA, with tool licences from many vendors extending beyond Synopsys, Cadence Design Systems and Mentor Graphics; we’re also partners. Our collaboration includes feedback from early adoption and enabling tool compatibility. It’s also essential that Arm’s IP models enable our silicon partners as well as their customers, so they must work effectively with EDA modelling and emulation platforms. Modelling by automotive OEMs and Tier-1s is a growth area as the industry ups its pace.

Arm’s business is built on partnership, and one less-discussed aspect is our third-party IP ecosystem. It takes many IP components and tools to realize an SoC, making interoperability of IP essential.

Arm plays a key role in improving interoperability, for example through the Arm AMBA specifications. These specifications are developed with partners from across the semiconductor industry. Contributors include EDA and IP vendors, covering applications such as Verification IP and on-chip interconnect.

My presence at DAC was to advocate good practice in system design and strengthen Arm’s relationships with interconnect vendors. Our goal is to enable a solution that ensures Arm remains the most cost-effective choice as we strive to enhance capability and reduce the friction of integration. Over the coming months, we will do more to highlight the complement of third-party interconnects within Arm’s automotive ecosystem, as such products are a valid alternative to Arm’s own interconnects.

3. The Myths of System Resilience

A topic of great interest to many in the automotive space is how resilient and functionally safe systems can be realized. At DAC, I presented the “Myths of system resilience,” which challenged misconceptions and showed a better, more scalable way of designing diagnostic capability across the components of an SoC to detect faults. You can watch the video interview on YouTube.

In discussion with automotive systems companies who use SoCs, we hear the first myth.

  • There is a common misunderstanding that generating error detection or correction codes (EDC/ECC) within a CPU or other data source is an effective way of protecting the interconnect and the peripherals at the data’s destination.

Such a monolithic approach to resilience is shown by Figure 1, which illustrates that the EDC protection is independent of the interconnect. This concept leads to the second myth.

  • Integrating several components developed for resilience in isolation leads to a resilient or safe system.

Figure 1. Monolithic approach to system resilience

Resilience and functional safety design requires a system view extending across the SoC hardware and software integration. For the hardware, trying to protect the interconnect in a monolithic way is potentially disastrous because the designer of the IP at the data source must have intimate knowledge of the interconnect to protect against faults. Without tailoring the protection capability to each SoC design, it is inevitable that the protection will be either insufficient or over-engineered.

In practice, it will likely be insufficient for functional safety because wrapping a bus transaction with an EDC/ECC won’t guard against all single point faults, especially those in multiplexors. Figure 2 illustrates the effect when a single wire in a multiplexor select tree is inverted. By following the red colour coding in the figure, the consequences and erroneous multibit output are visible.

Figure 2. A single point fault in a select tree usually causes a multibit error
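The effect can be illustrated with a small software model. The sketch below is not taken from the presentation: the 4:1 multiplexor built from 2:1 stages, the `inverted_wire` parameter and the example data words are all assumptions chosen for illustration. It inverts one select wire and counts how many output bits differ from the intended word.

```python
def mux2(a: int, b: int, sel: int) -> int:
    """2:1 multiplexor: returns b when sel is 1, otherwise a."""
    return b if sel else a


def mux4(inputs, sel: int, inverted_wire=None) -> int:
    """4:1 multiplexor built as a tree of 2:1 stages.

    inverted_wire (hypothetical fault-injection hook) names the select
    wire, 0 or 1, whose value is inverted to model a single point fault.
    """
    s0 = sel & 1
    s1 = (sel >> 1) & 1
    if inverted_wire == 0:
        s0 ^= 1
    if inverted_wire == 1:
        s1 ^= 1
    low = mux2(inputs[0], inputs[1], s0)
    high = mux2(inputs[2], inputs[3], s0)
    return mux2(low, high, s1)


def bit_errors(expected: int, actual: int) -> int:
    """Number of bit positions in which the two words differ."""
    return bin(expected ^ actual).count("1")


# Illustrative 32-bit words presented by four data sources (assumed values).
WORDS = [0x00000000, 0xDEADBEEF, 0x12345678, 0xFFFFFFFF]

good = mux4(WORDS, sel=1)                       # fault-free selection
bad_s0 = mux4(WORDS, sel=1, inverted_wire=0)    # fault on select wire 0
bad_s1 = mux4(WORDS, sel=1, inverted_wire=1)    # fault on select wire 1

print(f"intended {good:#010x}")
print(f"s0 fault {bad_s0:#010x}, {bit_errors(good, bad_s0)} bits wrong")
print(f"s1 fault {bad_s1:#010x}, {bit_errors(good, bad_s1)} bits wrong")
```

In this model a fault on one wire delivers an entirely different word, so the number of wrong bits depends only on how much the two words happen to differ, not on the size of the fault.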

Multibit errors are likely to occur within interconnects. Parity and Single Error Correct Dual Error Detect ECC schemes are commonly considered for protection, but perform poorly at detecting multibit faults, even over several transactions. Other classes of EDC perform better but quickly expose the limitations of the monolithic approach.
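The weakness of parity against multibit errors can be checked directly. The following is a minimal sketch assuming a single even-parity bit over an 8-bit word; the function names and the example word are illustrative, and SECDED ECC is not modelled here. A parity check only notices an error pattern that flips an odd number of bits.

```python
def parity_bit(word: int) -> int:
    """Even-parity bit: 1 when the word has an odd number of set bits."""
    return bin(word).count("1") & 1


def check_passes(data: int, parity: int) -> bool:
    """Destination-side check: True when the data matches its parity bit."""
    return parity_bit(data) == parity


def detected_patterns(width: int):
    """Count the non-zero error patterns of the given width that parity detects.

    Flipping the bits in pattern e changes the parity only when e has an
    odd number of set bits, so only those patterns are detected.
    """
    total = (1 << width) - 1
    detected = sum(1 for e in range(1, 1 << width) if parity_bit(e))
    return detected, total


sent = 0b10110010            # example data word (assumed value)
code = parity_bit(sent)      # parity generated at the data source

one_flip = sent ^ 0b00000001     # single-bit error
two_flips = sent ^ 0b00000011    # double-bit error

print("single-bit error detected:", not check_passes(one_flip, code))
print("double-bit error detected:", not check_passes(two_flips, code))

detected, total = detected_patterns(8)
print(f"{detected} of {total} possible error patterns detected")
```

If a fault can produce any error pattern, as in a select tree, a single parity bit catches only about half of them.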

Although Arm is never the integrator for production SoCs, our architects have a vital role in leading the industry. This role includes shaping the future of AMBA specifications to lower the cost of integrating Arm processors, including resilient system design.

Following extensive analysis, we determined that a modular approach to system resilience is needed, as shown by Figure 3. With a modular approach, each component is designed by the engineers who understand it best, which enables it to have dedicated and optimised fault detection mechanisms. This approach ensures each part is configurable with adequate diagnostic coverage, even when the designer has no awareness of the system context.

Figure 3. Modular approach to system resilience

With each component providing its own tailored capability, it’s the role of the SoC integrator to ensure that their safety or resilience goals are met at the system level. It’s also the role of AMBA to define the bus interfaces between the components and show a philosophy applicable to other interfaces.

When defining interfaces for resilience, it’s essential to understand what is being protected. Within SoCs, an interface between components manifests as short random wiring amongst random logic. This reality is very different from the regularly ordered ideal layout the automotive market sometimes imagines, so distinguishing the interconnect from its interfaces is essential to understanding and realizing an effective resilient SoC architecture.

Arm is already developing new generation IP aligned to the modular approach. Several interconnect providers offer resilient interconnects that complement Arm’s products.

Summary

Realizing a cost-effective system takes much more than a processor. That’s why Arm invests significantly in partnerships, ecosystem and architectural definitions to make Arm processors the lowest cost choice. It may be possible to skip these investments and offer a seemingly cheaper processor; however, it’s the total cost of ownership that counts.

A modular approach to system resilience makes it clearer, easier and cheaper to realize a resilient system with Arm processors, especially when functional safety is required. Arm will continue to invest in system architecture and product engineering to facilitate shared success. You can see our current automotive product range on our Arm Automotive solution page.

At the 55th DAC, many people from across the IP industry came together. Arm’s automotive developer community also unites people, and you can join or simply follow the link below.

Visit AADC
