A working example of how to use the herd7 Memory Model Tool

February 18, 2020

7 minute read time.

This blog is a working example of how to use the Memory Model Tool. It provides an opportunity to experiment with the model and develop an intuitive understanding of how it works. The information is useful to software developers, compiler writers, and verification engineers. This is the first installment in a series of blogs about the Memory Model Tool. You can access the tool through the Arm Developer website and at the bottom of this blog.

1. First steps

Here we’re going to learn about the essential concepts to use the tool.

1.1. Litmus tests

Litmus tests are small snippets of assembly or pseudo-assembly code that allow us to examine the behavior of a chip (see http://diy.inria.fr/doc/litmus.html), or a model (see http://diy.inria.fr/doc/herd.html).

The following is our first litmus test. In this test, called MP (short for message passing), two processors P0 and P1 communicate VIA two shared memory locations x and y, both initialized to 0:

AArch64 MP 
{
0:X1=x; 0:X3=y;
1:X1=y; 1:X3=x;
}
 P0          | P1          ;
 MOV W0,#1   | LDR W0,[X1] ;
 STR W0,[X1] | LDR W2,[X3] ;
 MOV W2,#1   |             ;
 STR W2,[X3] |             ;
exists (1:X0=1 /\ 1:X2=0)

On the left, the thread P0 writes 1 to memory location x, and 1 to memory location y. On the right, the thread P1 reads from y and places the result into register W0 and reads from x and places the result into register W2. The registers W0 and W2 are private to P1.

Essentially, P0 writes a message in x, then sets up a flag in y, so that when P1 sees the flag (via its read from y), it can read the message in x.

At the bottom of the test, we ask “is there an execution of this test such that register W0 contains the value 1 and register W2 contains the value 0?”.

What do you think? Do you think such an execution is possible?

1.2. The herd7 tool

Now, let us try it out. The herd7 tool lets us simulate a model, and run litmus tests against that model, to determine which executions are allowed by this model. So let us go to https://developer.arm.com/herd7.

In the “litmus test” box, find the file mp.litmus in the drop box. Let us click on the play button.

1.2.1. Histograms

In the “histogram” box, we see the following result:

Test MP Allowed
States 4
1:X1=0; 1:X2=0;
1:X1=0; 1:X2=1;
1:X1=1; 1:X2=0;
1:X1=1; 1:X2=1;
Ok

Note that we see the result 1:X0=1; 1:X2=0; which we asked about in our test. This result is reachable by our test, and therefore the tool says Ok; otherwise it would say No.

1.2.2. Executions

In the “executions” box, we see that this test can have four different executions. Note that you can download the drawings if you wish (there's a button at the bottom-right corner).

Memory model - executions

(Figure 1)

In all of them we see four events: writes to x or y, of the shape Wx=1, which means that value 1 was written to the memory location x, and reads from x or y, of the shape Ry=0, meaning that memory location y was read, and the value read was 0. We also see three types of relations over these events: po arrows, rf arrows, and fr arrows.

Program order

The po arrows represent the program order. The program order intuitively corresponds to the order in which the events occur on a thread. For example, in our test, on P0, the write of x appears in program order before the write of y. Therefore, the two corresponding events in Fig. 1 are related by po. Similarly, the read of y on P1 appears in our test in program order before the read of x; therefore, the two corresponding events in Fig. 1 are related by po.

Note also that po is transitive. In Fig. 1, suppose there was an extra event e occurring on P1 with a po arrow from d to e. In this case, there would also be a po arrow from c to e. That is, e is after c in the program order.

Read-from

The rf (read-from) arrows depict who reads from where; more precisely, for each read of a certain location, it finds a unique write of the same location and value.

Coherence order

In most models I know of, there is a notion of a coherence order. Intuitively it is a history of all writes to a given memory location x, that represents the order in which writes to x hit x. If you are sitting in memory location x, looking at writes falling onto you, and recording their values as they come by, you would get the coherence order. More precisely, the coherence order co is a total order overwrites to a given memory location.

From-read

Using the read-from and coherence relations, we can build a relation called from-read:

Memory model - from read

(Figure 2)

Intuitively, a from-read arrow starts from a read of a given location x, for example the read r just above, and points to all the writes that overwrite the value this read has taken. In the drawing above, the read r takes its value from the write w0, as shown by the rf arrow between them. The write w0 is then overwritten by the write w1, as shown by the co arrow between them. Therefore, there is an fr arrow between r and w1, as w1 overwrites the write w0 from which r reads. The write w1 is then overwritten by the write w2, as shown by the co arrow between them. There is an fr arrow between r and w2 too.

1.2.3. Reading the output of the tool

Let’s go through each state and execution one by one.

State 1:X0=1; 1:X2=1;

This is the state where P1 does see that the flag y has been set to 1 and sees that the data x has been updated to 1. In other words, the read of y on P1 reads 1 from the update of y by P0, and the read of x on P1 reads 1 from the update of x by P0. We represent this as follows:

Memory model - MP4

(Figure 3)

a read-from (rf) arrow between the write of x on P0 and the read of x on P1
a read-from (rf) arrow between the write of y on P0 and the read of y on P1.

State 1:X0=0; 1:X2=1;

This is the state where P1 does not see that the flag y has been set to 1, but sees that the data x has been updated to 1. In other words, the read of y on P1 reads 0 from the initial state, and the read of x on P1 reads 1 from the update of x by P0. We represent this as follows:

Memory model - MP2

(Figure 4)

A read-from (rf) arrow between the write of x on P0 and the read of x on P1
A from-read (fr) arrow between the read of y on P1 and the write of x on P0; the reason behind this fr arrow is as follows:
- The read of y on P1 reads from the initial state
- The initial state for y is a notional write of y which precedes the write of y on P0 in the coherence order
- Therefore, the value read by the read of y on P1 has been overwritten by the update of y on P0, hence the read of y on P1 is in from-read before
  the write of y on P0.

State 1:X0=0; 1:X2=0;

This is the state where P1's reads do not see the updates by P0, and both read from the initial state. We represent this as follows:

Memory model - MP1

(Figure 5)

A from-read (fr) arrow between the read of x on P1 and the write of x on P0
A from-read (fr) arrow between the read of y on P1 and the write of y on P0.

The reason behind each fr arrow is as follows:

The read on P1 reads from the initial state
The initial state is a notional write which precedes the update of the same variable on P0 in the coherence order
Therefore, the value read by the read on P1 has been overwritten by the update on P0, hence the read on P1 is in from-read before the write on P0.

State 1:X0=1; 1:X2=0;

This is the state where P1 sees that the flag y has been set to 1 and does not see that the data x has been updated. In other words, the read of y on P1 reads 1 from the update of y on P0, and the read of x on P1 reads 0 from the initial state. We represent this as follows:

Memory model - MP3

(Figure 6)

A read-from (rf) arrow between the write of y on P0 and the read of y on P1
A from-read (fr) arrow between the read of x on P1 and the write of x on P0

The reason behind this fr arrow is as follows:

The read of x on P1 reads from the initial state
The initial state for x is a notional write of x which precedes the update of x on P0 in the coherence order
Therefore, the value read by the read of x on P1 has been overwritten by the update of x on P0, hence the read of x on P1 is in from-read before the write of x on P0.

Access the Memory Model Tool via the Arm Developer website:

Memory Model Tool

zan over 5 years ago

Hi Jade,

Is there anywhere we can download ARM memory model tool instead of just herdtools? Can we run the litmus tests in batch mode the similar way as to generate them?

Thank you very much.

Zan
- Cancel
- Up 0 Down
- Reply
- More
- Cancel
Hernán Ponce de León over 5 years ago

Looking forward to read the whole series of blogs about the Memory Model Tool
- Cancel
- Up 0 Down
- Reply
- More
- Cancel

Architectures and Processors blog

Scalable Matrix Extension: Expanding the Arm Intrinsics Search Engine

Chris Walsh

Arm is pleased to announce that the Arm Intrinsics Search Engine has been updated to include the Scalable Matrix Extension (SME) intrinsics, including both SME and SME2 intrinsics.
- October 3, 2025
Arm A-Profile Architecture developments 2025

Martin Weidmann

Each year, Arm publishes updates to the A-Profile architecture alongside full Instruction Set and System Register documentation. In 2025, the update is Armv9.7-A.
- October 2, 2025
When a barrier does not block: The pitfalls of partial order

Wathsala Vithanage

Acquire fences aren’t always enough. See how LDAPR exposed unsafe interleavings and what we did to patch the problem.
- September 15, 2025

AI blog

Announcements

Architectures and Processors blog

Automotive blog

Embedded and Microcontrollers blog

Internet of Things (IoT) blog

Laptops and Desktops blog

Mobile, Graphics, and Gaming blog

Operating Systems blog

Servers and Cloud Computing blog

SoC Design and Simulation blog

Tools, Software and IDEs blog