In the case of a ReadUnique transaction, the cache line is copied into the initiating master's cache (whether it is clean or dirty) and invalidated in the snooped master's cache, and then the store operation is performed on the initiating master's cache line.
In the case of a CleanUnique transaction, the cache line is copied to main memory if it is dirty (and invalidated in the snooped master's cache), and then the store operation is performed in the initiating master's cache.
Why do we need the extra write to memory in the case of a CleanUnique transaction?
I may be replying to this question rather late. Anyway, as I understand it, ReadUnique and CleanUnique transactions are almost the same: both leave the initiating master holding the only copy of the line so that it can store to it. The difference is that CleanUnique is the more efficient of the two when the master already holds the line. Here is the explanation.
ReadUnique: is used by a master prior to performing a partial line write, i.e., when it is updating only some bytes in the line and does not yet hold a copy. To do this, it needs an up-to-date copy of the whole line, which it receives as part of the transaction, either from the snooped master's cache or from memory.
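CleanUnique: is the same as ReadUnique, except that the master already has a copy of the line in either SharedClean or SharedDirty state. Note that only one master can have a copy of the cache line in SharedDirty state. Since the master's own copy is already current, performing CleanUnique saves the additional read of the line that a ReadUnique transaction performs. As far as I can tell, this is also where the write to memory comes from: no data is returned to the initiating master, so a dirty copy held by another master cannot simply be handed over and is instead written back to memory before it is invalidated.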
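To make the comparison concrete, here is a minimal sketch in Python of a toy model tracking one cache line across two masters. It is not taken from the ACE specification: the class `ToySystem`, its counters, and the simplification that ReadUnique always hands a dirty line straight to the initiator are my own assumptions for illustration only.

```python
from dataclasses import dataclass

# ACE-style line states (abbreviations chosen for this sketch)
INVALID, UNIQUE_CLEAN, UNIQUE_DIRTY = "I", "UC", "UD"
SHARED_CLEAN, SHARED_DIRTY = "SC", "SD"
DIRTY = {UNIQUE_DIRTY, SHARED_DIRTY}


@dataclass
class ToySystem:
    """Hypothetical model of ONE cache line held by several masters."""
    states: dict                 # master name -> state of the line
    data_to_initiator: int = 0   # line transfers into the initiator's cache
    memory_writes: int = 0       # write-backs of the line to main memory

    def _invalidate_others(self, initiator):
        """Invalidate all other copies; report whether one was dirty."""
        found_dirty = False
        for master, state in list(self.states.items()):
            if master != initiator and state != INVALID:
                found_dirty |= state in DIRTY
                self.states[master] = INVALID
        return found_dirty

    def read_unique(self, initiator):
        # Initiator has no usable copy: the line data must be delivered.
        # Assumption: a dirty line is passed to the initiator, not written back.
        dirty = self._invalidate_others(initiator)
        self.data_to_initiator += 1
        self.states[initiator] = UNIQUE_DIRTY if dirty else UNIQUE_CLEAN

    def clean_unique(self, initiator):
        # Initiator already holds current data: nothing is delivered to it,
        # so a dirty copy found elsewhere is written back to memory instead.
        assert self.states[initiator] != INVALID, "CleanUnique needs a copy"
        if self._invalidate_others(initiator):
            self.memory_writes += 1
        was_dirty = self.states[initiator] in DIRTY
        self.states[initiator] = UNIQUE_DIRTY if was_dirty else UNIQUE_CLEAN

    def store(self, initiator):
        # A store is only legal once the line is held in a Unique state.
        assert self.states[initiator] in (UNIQUE_CLEAN, UNIQUE_DIRTY)
        self.states[initiator] = UNIQUE_DIRTY


# Case A: M0 has no copy, M1 holds the line dirty -> ReadUnique
a = ToySystem({"M0": INVALID, "M1": UNIQUE_DIRTY})
a.read_unique("M0")
a.store("M0")
print(a.states, a.data_to_initiator, a.memory_writes)

# Case B: M0 holds SharedClean, M1 holds SharedDirty -> CleanUnique
b = ToySystem({"M0": SHARED_CLEAN, "M1": SHARED_DIRTY})
b.clean_unique("M0")
b.store("M0")
print(b.states, b.data_to_initiator, b.memory_writes)
```

In both cases M0 ends up in UniqueDirty and M1 in Invalid. The difference is in the counters: case A moves one line of data into M0's cache and writes nothing to memory, while case B moves no data into M0's cache and performs one write-back. So in this model the memory write in CleanUnique is the price of skipping the data transfer, not an extra cost on top of what ReadUnique does.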
In conclusion, I am afraid that your question may not be valid. For further reading, you can look at the Cache Coherency white paper by ARM.
Cheers,
Dr. Ray