I would like to know which perf counters describe remote memory loads from other cores or nodes. The Arm Cortex-A Series Programmer's Guide for Armv8-A (https://cs140e.sergio.bz/docs/ARMv8-A-Programmer-Guide.pdf; it would be nice to know the official link to that document too) says (at 11-7):
> For multi-core and multi-cluster systems, before performing a load from external memory, the caches of L2 or L1 caches of cores within the cluster or of other clusters might also be checked
What perf counters describe such loads?
Thank you so much for the reply and the explanation, vstehle.