I was wondering whether there is any documentation on how to analyze the ARM pipeline. I have access to ThunderX2 nodes, and I'd like to do the kind of bottleneck analysis that can be done on Intel chips. For Skylake, I can get the formulas to compute the different metrics here: https://github.com/andikleen/pmu-tools/blob/master/skl_client_ratios.py. I checked the regular Arm docs (the ARM Architecture Reference Manual ARMv8, for ARMv8-A architecture profile, and the Programmer’s Guide for ARMv8-A) and did not find any information.
Thanks for your query. I would be curious to learn a bit more about your application to make sure I point you in the right direction (for instance the programming language you use, the type of application and paradigms you rely on).
If you are working with C/C++, Fortran or Python codes, Arm Forge (and in particular Arm MAP) may be suitable for you. We did a webinar recently about this very topic. I encourage you to have a look here: https://www.brighttalk.com/webcast/17792/384060/top-down-performance-analysis
If you would like to give it a go, you can download a trial version of Forge on this page: https://developer.arm.com/tools-and-software/server-and-hpc/trials
I hope this helps. Let me know if I can be of further assistance.
The application is a set of benchmarks, SPEC OMP2012. The goal is to understand the underlying architecture, bottlenecks, etc. when using different computational kernels.