Hi,
I was wondering whether there is any documentation on how to analyze the ARM pipeline. I have access to ThunderX2 nodes, and I'd like to do the kind of bottleneck analysis that can be done on Intel chips. For Skylake, I can get the formulas used to compute the different metrics here: https://github.com/andikleen/pmu-tools/blob/master/skl_client_ratios.py. I checked the regular ARM docs (the ARM Architecture Reference Manual ARMv8, for ARMv8-A architecture profile, and the Programmer’s Guide for ARMv8-A) and did not find any information.
thanks,
Thanks for the links. I'll definitely take a look. I have a number of OpenMP benchmarks and I'm trying to understand their bottlenecks on the ThunderX2. I know my way around perf, so I am interested in figuring out which perf events to focus on. I'll take a look at the trial version.
yectli
Saw the video, thanks. I was hoping to find a resource like this:
https://github.com/torvalds/linux/tree/master/tools/perf/pmu-events/arch/x86
Here you can get a breakdown and information on how the top-down metrics are computed. I like to understand what is going on. When I check the aarch64 equivalent, it is empty:
https://github.com/torvalds/linux/tree/master/tools/perf/pmu-events/arch/arm64/cavium/thunderx2
Is there a white paper, document, or URL where I could get more info on the ThunderX2 top-down approach? I'm trying to learn more about the pipeline, etc.
thanks
Hi Yectli,
Thanks for clarifying. This is an interesting query and I do not believe such a document exists today. I have forwarded your request to my colleague Florent, who created the webinar and is one of our experts on the subject. We will do our best to assist you.
Best regards,
Patrick
If you share the formulas in this forum, or in some ARM document, people can cite the URL in the references section of a white paper or other publication. That way people can share knowledge and make it easier to profile on aarch64. That is a win for everyone, ARM included. What is needed are the formulas for the main categories and subcategories: people want to see why their code is core bound or memory bound, and whether it is L1 bound, L2 bound, external memory bound, etc.
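To make the request concrete, here is a rough sketch of the kind of level-1 formulas I mean, written the way I understand them for Skylake from the pmu-tools file linked earlier. The counter names are Intel events and do not exist on ThunderX2; the function name `topdown_level1`, the dictionary layout, and the sample numbers are made up for illustration. An aarch64 version would need the equivalent ThunderX2 events, which is exactly the information I am missing.

```python
def topdown_level1(counts, width=4):
    """Level-1 top-down breakdown from raw counter values.

    `counts` maps Intel Skylake event names to counts for one run.
    `width` is the number of issue slots per cycle (4 on Skylake).
    Returns the fraction of slots attributed to each category.
    """
    cycles = counts["CPU_CLK_UNHALTED.THREAD"]
    slots = width * cycles
    if slots <= 0:
        raise ValueError("cycle count must be positive")

    # Slots in which the frontend delivered no uop to the backend.
    frontend = counts["IDQ_UOPS_NOT_DELIVERED.CORE"] / slots
    # Slots spent on uops that never retire, plus recovery after a flush.
    bad_spec = (counts["UOPS_ISSUED.ANY"]
                - counts["UOPS_RETIRED.RETIRE_SLOTS"]
                + width * counts["INT_MISC.RECOVERY_CYCLES"]) / slots
    # Slots that did useful work.
    retiring = counts["UOPS_RETIRED.RETIRE_SLOTS"] / slots
    # Whatever is left is attributed to the backend (core or memory).
    backend = 1.0 - (frontend + bad_spec + retiring)

    return {
        "frontend_bound": frontend,
        "bad_speculation": bad_spec,
        "retiring": retiring,
        "backend_bound": backend,
    }


if __name__ == "__main__":
    # Made-up counter values, only to show how the function is called.
    sample = {
        "CPU_CLK_UNHALTED.THREAD": 1000,
        "IDQ_UOPS_NOT_DELIVERED.CORE": 800,
        "UOPS_ISSUED.ANY": 2200,
        "UOPS_RETIRED.RETIRE_SLOTS": 2000,
        "INT_MISC.RECOVERY_CYCLES": 50,
    }
    for name, value in topdown_level1(sample).items():
        print(f"{name}: {value:.3f}")
```

The backend-bound fraction is then what gets split further into core bound and memory bound (L1, L2, external memory) at the next level, and those are the subcategory formulas I would like to see for ThunderX2.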