In Section C1.3, Channel Overview, of the AMBA AXI and ACE Protocol Specification, the following is stated under "Store operations where the cache line is already cached":
The initiating master component requests a unique copy of the cache line by issuing a CleanUnique transaction on the read address channel. This removes all other copies of the cache line and writes any dirty copy to main memory.
Consider that our initiating master has a clean, shared copy. There is another master having a dirty, shared copy.
Now the initiating master issues a CleanUnique transaction on the Read Address Channel. Since the snooped master has had a dirty copy, the interconnect constructs a transaction to write the cacheline to the main memory, and provides a response to the initiating master.
At this point, the initiating master has a copy that is no longer clean, since its copy is modified relative to main memory, and the previously dirty cache line was not provided to the initiating master.
The next step mentioned is that the master performs a store and uses the RACK signal to indicate that the transaction has been completed.
This seems ambiguous, since the initiating master performed a store even though its copy of the cache line, though unique, wasn't clean.
Am I missing something?
Hi, I have a question about this. If a store hits in the cache, a CleanUnique transaction is issued on the read channels. Suppose that, at the same time, the processor sends a load request for the same address. It will be a load hit, so how should that load request be handled?
The behaviour depends on the nature of the cpu pipeline and of the store-to-load forwarding.
The store, which is waiting to acquire the rights to the cache line, is likely held in a store buffer, which the cpu (load-store unit) can search to satisfy a load that is later in program order and is partially or completely contained in the store.
If the load is completely satisfied by the store, load transactions can be avoided.
Or, the load waits until the store is drained into the cache, and then proceeds in a manner usual for loads (reading from the line and/or issuing appropriate transactions).
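The two outcomes above (forward from the store buffer, or wait for the drain) can be sketched roughly as follows. This is only a toy model, not how any particular cpu implements it: the names StoreBuffer, PendingStore and lookup are made up for illustration, and a real load-store unit also has to respect the memory ordering rules of its architecture.

```python
from dataclasses import dataclass, field


@dataclass
class PendingStore:
    addr: int    # address of the first byte written (hypothetical layout)
    data: bytes  # bytes waiting to be drained into the cache


@dataclass
class StoreBuffer:
    entries: list = field(default_factory=list)  # oldest store first

    def push(self, addr, data):
        self.entries.append(PendingStore(addr, data))

    def lookup(self, addr, size):
        """Decide how a later load of `size` bytes at `addr` is handled."""
        # Search youngest-first so the load sees the most recent store.
        for st in reversed(self.entries):
            st_end = st.addr + len(st.data)
            if not (addr < st_end and st.addr < addr + size):
                continue  # this store does not touch the load's bytes
            if st.addr <= addr and addr + size <= st_end:
                off = addr - st.addr
                # Load fully contained in the store: forward the data.
                return ("forward", st.data[off:off + size])
            # Partial overlap: the load waits until the store drains.
            return ("wait", None)
        # No pending store overlaps: handle the load through the cache.
        return ("cache", None)
```

For example, with a 4-byte store pending at 0x100, a 2-byte load at 0x101 is forwarded, a 4-byte load at 0x102 has to wait, and a load at 0x200 goes to the cache as usual.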
If the question is instead about two transactions to the same cache-line from one ACE master, where the first is a store and the second is a load, it seems reasonable to assume that the master will serialize them (Section: "Definition of the ordering model" of the axi4 spec) to maintain the ordering, if nothing else does.
Thanks for replying.
I have two questions here:
1. As you said, the load request should wait until the store is drained into the cache, but where should this load request be held in the meantime?
2. Suppose the cache has a store hit, the MakeUnique transaction has not yet completed, and the cache receives a snoop read or invalidation request for the same line. How should the cache handle this scenario?
(1) The load waits by virtue of being in the pipeline of the cpu and waiting for responses from the memory system. The load-store functional unit (lsu) of a typical pipelined cpu communicates with, commands, and waits for the memory system on behalf of the load operation.
The functional relation between the lsu, the store buffer (stb) and the cache can be seen, e.g., here.
The lsu commands the buffer to drain, while the load (and other operations too) waits within the queues that the lsu maintains.
If the cpu implements store-to-load forwarding, the load can be completely satisfied by the buffered store's value, and the memory ordering rules do not prohibit such forwarding, then the execution of the load does not need to wait.
(2) The section "Overlapping MakeUnique" in axi4-ace spec describes the situation: the cache must invalidate (if applicable) the line, and wait for MakeUnique to complete before storing the cache line.
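The behaviour in (2) can be sketched roughly as a small state model. This is only an illustration of the rule quoted above, with made-up names (Line, store_hit, snoop_invalidate, make_unique_done); it assumes a full-line store, which is the case MakeUnique is meant for, and it leaves out the actual snoop response signalling.

```python
class Line:
    """Toy model of one cache line with an outstanding MakeUnique."""

    def __init__(self, data):
        self.valid = True
        self.data = data
        self.make_unique_pending = False
        self.pending_store = None  # full-line store data held back

    def store_hit(self, new_data):
        # Store hits the line: issue MakeUnique and hold the store
        # until the transaction completes.
        self.make_unique_pending = True
        self.pending_store = new_data

    def snoop_invalidate(self):
        # A snoop for the same line arrives while MakeUnique is
        # outstanding: the local copy is invalidated, the store stays held.
        self.valid = False
        self.data = None

    def make_unique_done(self):
        # MakeUnique has completed: only now is the store written.
        self.data = self.pending_store
        self.valid = True
        self.make_unique_pending = False
        self.pending_store = None
```

The point of the sketch is the ordering: the snoop is answered and the line given up first, and the store data is written only after the MakeUnique response has arrived.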
Thank you for replying. I understand now. I will surely contact you in case I find some other problem in future.
In the case of a ReadUnique transaction, the cache line is copied into the initiating master's cache (whether it is clean or dirty) and invalidated in the snooped master's cache, and then the store operation is performed on the initiating master's cache line.
In the case of a CleanUnique transaction, the cache line is written to main memory if it is dirty (and invalidated in the snooped master's cache), and then the store operation is performed in the initiating master's cache.
Why do we need the extra write to memory in the case of a CleanUnique transaction?
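The two flows being compared can be sketched as a toy model. This is not the full ACE state machine: caches are plain dicts, the function names are made up, and the model assumes that all shared copies of a line hold the same data, so the initiating master's clean copy matches the snooped master's dirty copy.

```python
def read_unique(initiator, others, memory, addr):
    """Initiator has no copy: fetch the data and invalidate other copies."""
    data, dirty = memory[addr], False
    for cache in others:
        line = cache.pop(addr, None)  # invalidate the snooped copy
        if line is not None:
            data = line["data"]       # data comes from the snooped cache
            dirty = dirty or line["dirty"]
    initiator[addr] = {"data": data, "dirty": dirty}


def clean_unique(initiator, others, memory, addr):
    """Initiator already holds the data: no data is returned to it."""
    for cache in others:
        line = cache.pop(addr, None)  # invalidate the snooped copy
        if line is not None and line["dirty"]:
            memory[addr] = line["data"]  # dirty copy is written to memory
    # The initiator keeps its existing line, now the only copy.


def store(cache, addr, value):
    cache[addr] = {"data": value, "dirty": True}
```

In this model the only difference is where the dirty data ends up: with read_unique it moves to the initiator, with clean_unique it goes to memory because nothing is carried back to the initiator.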
Now the initiating master issues a CleanUnique transaction on the Read Address Channel. Since the snooped master has had a dirty copy, the interconnect constructs a transaction to write the cacheline to the main memory, and provides a response to the initiating master.
That is correct, but why is it required? As in the ReadUnique case, couldn't the copy of the cache line be sent directly to the initiating master?