As we all know, writing ICC_SGI1R_EL1 is how one core raises an interrupt on another core.
I am a software engineer.
My question is: what kind of memory barrier should accompany writes to ICC_SGI1R_EL1?
This question relates to how the instruction "MSR ICC_SGI1R_EL1, x" is implemented.
If the instruction actually writes a memory-mapped register, should we use a data memory barrier?
The GIC 3.0 manual is the only one with this hint. Maybe that is one reason you noticed such a dramatic difference in latency between 2.0 and 3.0?
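To make the question concrete, here is a minimal sketch of one barrier placement around the SGI write. It is modeled on the pattern I recall from the Linux GICv3 driver (a DSB before the write so earlier stores are visible to the target core, then an ISB after it), so treat that ordering as an assumption to check against the GIC architecture specification, not as a statement of what is required. The function names sgi1r_encode and send_sgi are hypothetical, the IRM and RS fields are left at zero, and the MSR itself only works at EL1 or higher with the system register interface enabled.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack the ICC_SGI1R_EL1 fields into a register value:
 *   TargetList [15:0], Aff1 [23:16], INTID [27:24],
 *   Aff2 [39:32], Aff3 [55:48].
 * IRM and RS are left at zero in this sketch.
 */
uint64_t sgi1r_encode(uint8_t aff3, uint8_t aff2, uint8_t aff1,
                      uint16_t target_list, uint8_t intid)
{
    assert(intid < 16); /* SGIs use INTIDs 0..15 */
    return ((uint64_t)aff3 << 48) |
           ((uint64_t)aff2 << 32) |
           ((uint64_t)intid << 24) |
           ((uint64_t)aff1 << 16) |
           (uint64_t)target_list;
}

/*
 * Hypothetical helper: issue the SGI with barriers around the write.
 * Must run at EL1 or higher; on other architectures it does nothing.
 */
void send_sgi(uint64_t sgi1r_value)
{
#if defined(__aarch64__)
    /* Assumption: make earlier stores visible before the target core is interrupted. */
    __asm__ volatile("dsb ishst" ::: "memory");
    /* S3_0_C12_C11_5 is the generic encoding of ICC_SGI1R_EL1. */
    __asm__ volatile("msr S3_0_C12_C11_5, %0" :: "r"(sgi1r_value) : "memory");
    /* Assumption: ISB so the system register write takes effect before later instructions. */
    __asm__ volatile("isb" ::: "memory");
#else
    (void)sgi1r_value;
#endif
}
```

For example, send_sgi(sgi1r_encode(0, 0, 0, 0x2, 3)) would request SGI 3 on the core with Aff0 = 1 in cluster 0.0.0. Note that the sketch uses DSB and not DMB: as far as I understand, DMB only orders memory accesses against each other, and an MSR to a system register is not a memory access, which is really the heart of my question.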