Why do we need atomicity in ARM Architecture?

abhipandya over 6 years ago

How does atomicity work with the memory accesses?


+1 daith over 6 years ago

That sounds like two or three questions to me:

How does it work architecturally, i.e. what does a conforming implementation have to ensure?

How would it be implemented in a system that wanted to exploit it well?

What use are the instructions, compared with what is already there, in making a system work better?

My understanding is:

Architecturally, they should act as if they were implemented using a load-exclusive/store-exclusive loop, with a few exceptions: for CAS, a monitor might not be cleared if the value isn't changed; the performance counters might differ from what you would expect; and the store forms don't tell you what was there originally and need not be counted as doing a load for memory barrier purposes.
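To make the comparison concrete, here is a rough C11 sketch rather than real assembly. The function names are made up for illustration. The comments about which instructions come out are assumptions: that depends on the compiler and target options (for example, whether something like `-march=armv8.1-a` is in effect), so check the disassembly rather than trusting them.

```c
#include <stdatomic.h>
#include <stdint.h>

/* One atomic read-modify-write that returns the old value.
   Assumption: with the ARMv8.1 atomics enabled a compiler may emit a single
   LDADD-style instruction here; on the base architecture it would typically
   be a load-exclusive/store-exclusive retry loop. */
uint32_t counter_add(_Atomic uint32_t *counter, uint32_t delta)
{
    return atomic_fetch_add_explicit(counter, delta, memory_order_acq_rel);
}

/* Same addition, but the old value is thrown away. This is the situation the
   "store" forms cover: nothing is reported back to the caller. */
void counter_bump(_Atomic uint32_t *counter, uint32_t delta)
{
    (void)atomic_fetch_add_explicit(counter, delta, memory_order_relaxed);
}

/* The same operation written as an explicit retry loop. This has the shape
   of an exclusive-monitor loop: read, compute, try to write, retry if
   another observer got in first. */
uint32_t counter_add_retry(_Atomic uint32_t *counter, uint32_t delta)
{
    uint32_t seen = atomic_load_explicit(counter, memory_order_relaxed);
    /* On failure 'seen' is refreshed with the current value, so the next
       attempt adds delta to up-to-date data. */
    while (!atomic_compare_exchange_weak_explicit(counter, &seen, seen + delta,
                                                  memory_order_acq_rel,
                                                  memory_order_relaxed)) {
        /* retry */
    }
    return seen;
}
```

All three leave the same value in memory. What differs is whether the caller gets the old value back and how much retrying is visible in the code.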

In practice, they can be performed by exporting the operation to a point further out where the memory can be held exclusively and the operation done atomically there, with any caches in between (if there are any) cleared. I guess one could even send it off to another cache, associated with another PE, that held the data exclusively.

In a system that wanted to exploit it well, I guess there are problems with cache line boundaries that one would have to be careful about when a structure contains both items one might want to apply atomic operations to and other, non-atomic data. But overall, even ignoring that, having atomic operations cuts down on the conflicts and data movement that the load-exclusive/store-exclusive operations involve. This is especially important in large systems with lots of processors. It also helps avoid problems with debugging: nasty things have to be done in debuggers to cope with exclusive monitor loops! :) Basically, they're cleaner and faster.
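On the cache line point, one way to be careful is to give the atomically updated field a line of its own. The following is only a sketch under assumptions: the 64-byte line size, the `struct stats` layout, and the helper names are all invented for illustration, and the real line size should be taken from the target's documentation.

```c
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. This is common but not guaranteed. */
#define CACHE_LINE_BYTES 64

struct stats {
    /* Hot counter updated atomically by many PEs: starts its own line. */
    alignas(CACHE_LINE_BYTES) _Atomic uint64_t hits;
    /* Ordinary, non-atomic data: pushed onto the next line so that reads
       of it do not contend with updates to 'hits'. */
    alignas(CACHE_LINE_BYTES) char name[32];
    uint32_t flags;
};

/* True if two byte offsets within a line-aligned object share a line. */
int same_cache_line(size_t a, size_t b)
{
    return (a / CACHE_LINE_BYTES) == (b / CACHE_LINE_BYTES);
}

/* Count one event; only the counter's own line needs to move around. */
void stats_hit(struct stats *s)
{
    atomic_fetch_add_explicit(&s->hits, 1, memory_order_relaxed);
}
```

The cost is some wasted space per structure, so this is only worth doing for fields that are actually contended.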

0 abhipandya over 6 years ago in reply to daith

Can you give an example, or perhaps some assembly code?
0 daith over 6 years ago in reply to abhipandya

I'm thinking I may have misunderstood the question as asking why the atomic instructions in ARMv8.1 are wanted when their work can be done using the acquire/release and exclusive load/store instructions in the base architecture. If you are instead asking why it is extremely desirable to be able to support atomic operations irrespective of the architecture, then here's an introduction, and Wikipedia has lots of more detailed entries if one searches on the various terms:

Atomic Operations in Hardware