SoC Design and Simulation blog
Using Portable Stimulus in the Arm World: Creating bare-metal SW coherency scenarios

Nick
September 18, 2017

In my last blog (Navigating SoC Verification with Perspec Portable Stimulus), I introduced the Accellera Portable Stimulus Standard (PSS) and showed how Cadence Perspec System Verifier supports the creation of portable bare-metal Arm SoC integration tests using the Perspec PSLib for multicore Armv8 and Armv8.2 architectures. In this blog we will dig a little deeper into what PSLib supports and how it can be used out of the box to create a rich variety of coherent and I/O-coherent scenarios.

It is worth spending a few minutes revisiting cache and its place across the hierarchy of Arm IP. With the advent of DynamIQ, Arm’s new cluster microarchitecture, there are a multitude of places where cache lives: within each core (usually called L1 cache, and typically the smallest and fastest cache in the system); shared between cores of like type (usually called L2); shared across the cluster (called L3); and shared across clusters, where it may be called Last Level Cache (LLC) or System Cache and is typically the slowest but largest cache in the system.

Example of structure using Arm DynamIQ

There are many architectural options available when constructing such systems, so some or all of these caches may be present in your target system. Interestingly, with the announcement of the new CCIX protocol, we will soon see Arm-based SoCs that also share cache from chip to chip.

Given the number of options, and the need to integrate these complex compute subsystems into bigger SoCs that may also use I/O coherency to optimize system performance for high-speed I/O such as PCI Express, it is essential that the caching is fully exercised before committing to silicon. A bug in the integration of the SoC could prove disastrous.

To address this growing challenge, Cadence developed a rich set of portable actions, which comprise the Perspec PSLib. They are readily assembled into target scenarios, with code then generated at the push of a button. In fact, for two common cache-testing scenarios, the library provides a complete ready-made scenario.

False Sharing

I will now explain the “False Sharing” scenario in a little more detail. Look out for my next blog, coming soon, which will detail the “True Sharing” scenario.

False Sharing is a situation where cache lines are being used by a number of cores, and hence the system treats them as shared data, but the cores in fact use exclusively different parts of each cache line and therefore do not actually share data with each other.

The figure below shows by colour which core is using which bytes of the 64-byte cache line. We can immediately see that, within each cache line, each region of data is used exclusively by one core (one colour). This is what we mean by False Sharing.

False Sharing example

Also notice that the regions are not of regular size, although each is a whole number of bytes. The number of possible False Sharing situations is enormous, especially when the permutations of the hierarchical cache architecture are considered. Creating bare-metal SW scenarios that cover a good number of these permutations using hand-written code would be a significant challenge.
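As a rough illustration of the layout described above, here is a small Python model of one 64-byte cache line split into irregular per-core byte regions. It is not Perspec output or PSLib code; the names `split_line` and `is_false_sharing` are hypothetical, and the model makes a simplifying assumption that each core owns exactly one contiguous region and that the regions together cover the whole line.

```python
import random

CACHE_LINE_BYTES = 64  # line size used in the article's figure


def split_line(num_cores, rng, line_bytes=CACHE_LINE_BYTES):
    """Split one cache line into contiguous, non-empty byte regions, one per core.

    Assumption for this sketch: every core gets exactly one region and the
    regions cover the whole line.
    """
    # Pick num_cores - 1 distinct cut points strictly inside the line.
    cuts = sorted(rng.sample(range(1, line_bytes), num_cores - 1))
    bounds = [0] + cuts + [line_bytes]
    # Region i is the half-open byte range [start, end) owned by core i.
    return {core: (bounds[core], bounds[core + 1]) for core in range(num_cores)}


def is_false_sharing(regions, line_bytes=CACHE_LINE_BYTES):
    """True if several cores use the line but no byte has more than one owner."""
    owners = [0] * line_bytes
    for start, end in regions.values():
        for byte in range(start, end):
            owners[byte] += 1
    return len(regions) > 1 and max(owners) <= 1


rng = random.Random(2017)  # fixed seed so the layout is repeatable
layout = split_line(4, rng)
for core, (start, end) in layout.items():
    print(f"core {core}: bytes {start}..{end - 1} ({end - start} bytes)")
```

Every core touches the same line, so the coherency hardware has to treat the line as shared, yet `is_false_sharing` confirms that no individual byte is used by two cores. Overlapping regions, such as bytes 0..39 for one core and 32..63 for another, would fail the check because bytes 32..39 are genuinely shared.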

The PSLib provides a ready-made scenario for creating such tests with a number of degrees of freedom. The Perspec generator produces multiple tests from a single use case, greatly increasing test-writer productivity. The beauty of the Portable Stimulus model is that these scenarios can be intermixed with your own scenarios, creating stress tests that uniquely target your SoC. For example, you may want to mix cache stress with power management; this is readily achieved with Perspec.

Complex multithreaded use cases can be created very easily for any number of cores, with randomly selected regions of shared memory; see the example below.

Perspec is able to generate a huge number of specific test cases (the diagram above is one specific solution) through powerful constraint-solver technology and the PSS model, which abstractly defines data dependency independent of action ordering. This brings huge productivity to the test writer, as one test can create hundreds of possible solutions; the user can pick one and then run it on the SoC they are working on.
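To get a feel for why a single abstract scenario can yield so many concrete tests, here is a back-of-the-envelope count in Python. It says nothing about how the Perspec solver works; it only counts layouts under assumptions made for this sketch: a single 64-byte line, one contiguous non-empty region per core, regions covering the whole line, and cores assignable to regions in any order. The function name `count_layouts` is hypothetical.

```python
import math


def count_layouts(num_cores, line_bytes=64):
    """Count the byte-region layouts of one cache line under the sketch's assumptions."""
    # Choosing num_cores - 1 cut points among the line_bytes - 1 gaps between
    # bytes gives every split into contiguous, non-empty regions.
    splits = math.comb(line_bytes - 1, num_cores - 1)
    # Each split can be handed out to the cores in any order.
    return splits * math.factorial(num_cores)


for cores in (2, 4, 8):
    print(f"{cores} cores: {count_layouts(cores)} layouts")
```

For four cores this is 39,711 splits times 24 core orderings, or 953,064 layouts for one line alone, before considering multiple lines, access types, or the cache hierarchy. That is far more than anyone would write by hand, which is why generating tests from a constrained model is attractive.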

In the next blog I will dig a little deeper into how tests are created and how users can use coverage to decide which test or tests they want to run.
