Good day everyone. I am new to the forums, so I apologise in advance if I am posting in the wrong forum. I am looking to profile Android applications and collect hardware architectural counters (cache hit/miss rates, branch mispredictions, etc., in as much detail as I can) on a newer mobile big.LITTLE architecture (this is for my research project at school). I am looking to purchase a development board based on a Qualcomm Snapdragon SOC: the Snapdragon 888 Mobile Hardware Development Kit (https://developer.qualcomm.com/hardware/snapdragon-888-hdk). I have a question about profiling support for this SOC:
This SOC has what Qualcomm calls Kryo 680 CPU, which consists of: 1 x Kryo 680 Prime core (based on Arm Cortex-X1), 3 x Kryo 680 Gold core (based on Arm Cortex-A78), and 4 x Kryo 680 Silver core (based on Arm Cortex-A55). It says that these cores are based on Arm processors, and I was wondering if Arm-provided tools would be able to profile these cores. In particular I am looking at profiling the cores running Android 11 with Streamline Profiler of Arm Mobile Studio and later potentially Arm Development Studio. So my question is: will I be able to profile Android apps running on these cores with Streamline Profiler?
Another quick question related more to debugging hardware. The expansion board for this SOC has a 20 pin JTAG port. I know that JTAG is used for debugging and collecting traces. This is what I would be using it for. However, it is kind of throwing me off that the JTAG is on the display expansion and not on the board itself. I am new to development boards and so I am wondering if JTAG is also used for anything graphics/display/sensor related or is it just a design decision made by Qualcomm to put JTAG connector on the display expansion? Does anyone have any experience with those Qualcomm Development Kits who could shed some light on this?
I am attaching the PDF with information about the 888 Development Kit in question: snapdragon-888-mobile-hdk-product-brief_87-pu790-1 (2).pdf
Thank you very much,
Pavel.
Hi Pavel,
Firstly, to differentiate the Arm Mobile Studio and Arm Development Studio use cases: at a high level, Mobile Studio works with final devices (see the list below), whereas Development Studio is more generally used with development systems and similar.
https://developer.arm.com/tools-and-software/graphics-and-gaming/arm-mobile-studio/support/supported-devices
In principle, yes, Streamline should work with the device you mention above. You would require a Development Studio Platinum Edition license:
https://developer.arm.com/tools-and-software/embedded/arm-development-studio/learn/specs/supported-processor-cores
I say in principle, as there may be system-level restrictions in these devices that I am not aware of. This may be beyond the scope of your project (for both technical and budgetary reasons).
I cannot explicitly comment on why the JTAG port is on a daughter card, but I suspect it is just a design decision.
Note that you may get better responses to your question by posting to the Qualcomm specific forums:
https://developer.qualcomm.com/forum
Hi Ronan,
Thank you, this helps a lot. Another few questions if I may:
Even though Arm Mobile Studio is targeted at final devices, there should be no problem profiling a development platform with it, right? I am asking this in case I will not be able to find a modern final device with newer Arm cores and without custom vendor cores, which brings me to the next question:
On the Arm Mobile Studio supported devices page (https://developer.arm.com/tools-and-software/graphics-and-gaming/arm-mobile-studio/support/supported-devices) it says that Arm Mobile Studio works with the Samsung Galaxy S20 phone. It also says that the Galaxy S20 has two custom cores; those are Samsung Exynos M5 custom cores, which use the ARMv8.2 ISA. Do you know if these custom cores can be profiled with Arm Mobile Studio or not? My concern is that because they are custom cores, there will be system restrictions on profiling them.
Best Regards,
Hi Pavel, For the devices that are listed as supported we will support the custom cores, but you will only generally get the architectural PMU counters unless the vendor has provided us with the microarchitectural counter definitions. One caveat is that there is normally a little delay for Mobile Studio on us picking up the latest Cortex cores and custom cores in the latest devices. We'll be adding support for Cortex-X1 in the upcoming Mobile Studio 2021.0 release, and the Kryo cores in the 2021.1 release in the early summer.
Cheers, Pete
I see. Thank you Peter!
Where would I find a list of architectural PMU counters that I would have access to? Is the list more or less similar to this: (https://developer.arm.com/documentation/ddi0363/e/events-and-performance-monitor/about-the-events)? I know it is for a different processor, but still. What will I not have access to? Can you give an example of such counters? "Architecture" and "micro-architecture" seem to be somewhat overloaded terms.
Peter Harris said: a little delay for Mobile Studio on us picking up the latest Cortex cores and custom cores in the latest devices.
That is totally understandable. So you anticipate fully supporting profiling Kryo cores from Qualcomm with Arm Mobile Studio? Does that mean I will be able to profile the SnapDragon 888 Development board (developer.qualcomm.com/.../snapdragon-888-hdk) with Mobile Studio? Again, the reason I am choosing to profile the board as a final device with Mobile Studio is that it has the latest Qualcomm core and you support profiling it. I know this is a little backwards. Are earlier Kryo cores supported now? I am mainly wondering about Kryo 585 (en.wikipedia.org/.../Kryo).
Does Qualcomm provide the micro-architectural counter definitions for deeper profiling? If you are allowed to answer this question of course.
Hi Pavel, For the architectural counters see the Arm Architecture Reference Manual for the Armv8-A architecture profile. Chapter D7 documents the PMU, and D7.10 documents the events themselves.
https://developer.arm.com/documentation/ddi0487/ga
For the full event list for a specific Cortex core, including microarchitecture events, see that CPU model's Technical Reference Manual.
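As a rough illustration of how a few of the common architectural events relate to the metrics asked about earlier in the thread (cache miss rates, branch mispredictions), here is a minimal Python sketch. The event numbers are the common architectural ones from the Arm ARM event list; which of them a given core actually implements can vary. The `derived_rates` helper and the sample counts are invented for illustration and are not part of Streamline or any other Arm tool.

```python
# A few common architectural PMU events (event number -> mnemonic).
ARCH_EVENTS = {
    0x03: "L1D_CACHE_REFILL",  # L1 data cache refills (misses)
    0x04: "L1D_CACHE",         # L1 data cache accesses
    0x08: "INST_RETIRED",      # instructions architecturally executed
    0x10: "BR_MIS_PRED",       # mispredicted branches
    0x11: "CPU_CYCLES",        # cycles
    0x12: "BR_PRED",           # predictable branches speculatively executed
}


def derived_rates(counts):
    """Turn raw counts (mnemonic -> value) into simple ratios.

    A ratio is omitted when its denominator is missing or zero, which is
    what you would see if a counter reads as zero in a capture.
    """
    def ratio(numerator, denominator):
        den = counts.get(denominator, 0)
        return counts.get(numerator, 0) / den if den else None

    rates = {
        "l1d_miss_rate": ratio("L1D_CACHE_REFILL", "L1D_CACHE"),
        "branch_mispredict_rate": ratio("BR_MIS_PRED", "BR_PRED"),
        "ipc": ratio("INST_RETIRED", "CPU_CYCLES"),
    }
    return {name: value for name, value in rates.items() if value is not None}


# Hypothetical sample counts, purely for illustration.
sample = {
    "L1D_CACHE": 1000,
    "L1D_CACHE_REFILL": 50,
    "BR_PRED": 200,
    "BR_MIS_PRED": 10,
    "CPU_CYCLES": 0,  # a zero cycle count means IPC cannot be derived
}
print(derived_rates(sample))
```

Microarchitectural events from a core's Technical Reference Manual would extend the table with implementation-specific numbers; the ratios above only need the architectural ones.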
PavelGolikov said: So you anticipate fully supporting profiling Kryo cores from Qualcomm with Arm Mobile Studio? Does that mean I will be able to profile the SnapDragon 888 Development board
We aim to support them, but we can't test every board and there are some dependencies on Linux Perf being correctly integrated and configured in that board's BSP. So "yes in principle, (at least when we add CPU support - see below), but we've not tested it".
PavelGolikov said: Are earlier Kryo cores supported now? I am mainly wondering about Kryo 585
At the moment the supported list is:
* https://github.com/ARM-software/gator/blob/master/daemon/pmus.xml#L110
So no Kryo 500 series at the moment - our aim is to support 56x and 58x in our mid-year release.
PavelGolikov said: Does Qualcomm provide the micro-architectural counter definitions for deeper profiling?
I can't really cover what Qualcomm provides - I honestly don't know. The architectural counters must be present, but beyond that I have no idea ...
I see. Just to be clear, even though the XML you posted lists support for Kryo 460/485/495 Gold, there is no support for PMU microarchitectural counters for these cores, only architectural ones. Do I understand correctly? Or are Kryo cores not considered proprietary like Samsung Exynos?
Hi Pavel, For cores that are entirely designed and built by one of our architecture licensees, we can only support architectural counters unless the partner contributes counter definitions for their custom microarchitecture counters.
For cores that are based on an Arm Cortex core and customized by an architecture licensee, we can support all of the microarchitecture counters provided by the ancestor Cortex CPU core. The XML linked above shows which counter sets are aliases of those provided by a Cortex core; these will include microarchitecture counters.
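To make the aliasing idea concrete, here is a small Python sketch that reads a PMU list and reports which entries reuse a Cortex counter set. The XML below is a simplified, hypothetical stand-in: the attribute names (`core_name`, `counter_set`), the core names, and the values are assumptions for illustration, not copied from the gator repository, so check the real pmus.xml for the actual layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for a PMU list. Attribute names and
# values are illustrative assumptions only.
SAMPLE_XML = """
<pmus>
  <pmu core_name="Cortex-A76" counter_set="ARMv8_Cortex_A76"/>
  <pmu core_name="Example Derived Gold" counter_set="ARMv8_Cortex_A76"/>
  <pmu core_name="Example Custom Core" counter_set="Other"/>
</pmus>
"""


def classify(xml_text):
    """Map each core name to 'cortex', 'cortex-derived' or 'custom'."""
    result = {}
    for pmu in ET.fromstring(xml_text).iter("pmu"):
        name = pmu.get("core_name", "")
        counter_set = pmu.get("counter_set", "")
        if name.startswith("Cortex"):
            result[name] = "cortex"
        elif "Cortex" in counter_set:
            # Counter set is an alias of a Cortex core's set, so the
            # ancestor's microarchitecture counters would apply.
            result[name] = "cortex-derived"
        else:
            # Architectural counters only, unless the vendor contributes
            # definitions for its own microarchitecture counters.
            result[name] = "custom"
    return result


for core, kind in classify(SAMPLE_XML).items():
    print(f"{core}: {kind}")
```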
HTH, Pete
Thank you very much Peter! This is great news :)
Mobile Studio 2021.0 is now available with Cortex-X1 support.
Kind regards, Pete
Thank you Peter! That's great news!
Hi everyone. I am also interested in profiling applications using the same development board. While using Streamline I see the following warnings.
Warning 1: No Perf PMUs detected. Could not detect any Perf PMUs in /sys/bus/event_source/devices/ but the system contains recognised CPUs. The system may not support perf hardware counters. Check CONFIG_HW_PERF_EVENTS is set and that the PMU is configured in the target device tree.
Warning 2: Profiling Source. Using perf API for primary data source
Warning 3: Atrace is disabled. Unable to locate notify.dex
Should I be worried about these? How does this affect PMU stats capture?
Thanks,
Victor
Edit: Same error appears here but if I'm not mistaken the question is about Linux, whereas I'm interested in Android.
Some more info.
CONFIG_HW_PERF_EVENTS
This is a capture with some CPU stats (Cycles and Cache Accesses). All of them have values.
This is another capture where Thermal Throttling Stats have been added to the list. All the CPU stats are zero now. This happens when I add other stats as well (e.g., CPU Clock from Perf Software).
Is this behaviour expected?
Hi torvik - leaving aside the warning for a second, is the main issue that counter values appear to disappear when certain other counters are selected?
Assuming that is the case, can you tell me what combinations seem to work / not work? If you stop the capture and let the analysis complete, do you see the missing data now?
What device is this and is it stock or rooted?
The warnings are most likely because the device has something like a generic /sys/bus/event_source/devices/armv8_pmu entry rather than per-CPU-specific entries, and they can usually be ignored.
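For anyone wanting to check which kind of entry their device exposes, here is a minimal Python sketch that groups the names found under the perf event source directory. The name prefixes used to tell a generic entry from a CPU-specific one are a heuristic assumed for illustration (real names depend on the kernel and device tree), and it assumes Python can see the directory, for example on the target itself or against a copy of the listing.

```python
import os


def classify_perf_pmus(devices_dir="/sys/bus/event_source/devices"):
    """Group the entries under devices_dir by a simple naming heuristic."""
    groups = {"generic": [], "cpu_specific": [], "other": []}
    try:
        names = sorted(os.listdir(devices_dir))
    except OSError:
        # Directory missing or unreadable, e.g. not running on the target.
        return groups
    for name in names:
        if name.startswith("armv8_pmu"):
            # Assumed pattern for a generic Armv8 PMU entry.
            groups["generic"].append(name)
        elif name.startswith(("armv7_", "armv8_")):
            # Assumed pattern for a CPU-specific entry.
            groups["cpu_specific"].append(name)
        else:
            # Software, tracepoint and other non-CPU event sources.
            groups["other"].append(name)
    return groups


if __name__ == "__main__":
    for kind, names in classify_perf_pmus().items():
        print(kind, names)
```

An empty "generic" and "cpu_specific" list would match the situation described in Warning 1; only a generic entry would match the case Ben describes.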
Hi Ben.
Ben Gainey said: Assuming that is the case, can you tell me what combinations seem to work / not work?
Some counters are always zero. There aren't any concrete combinations that always work or always don't work. It seemed that adding thermal stats or perf stats resulted in the other counters being zero. However, as you can see in this screenshot having thermal stats and CPU stats works fine.
Ben Gainey said: If you stop the capture and let the analysis complete, do you see the missing data now?
I tried again with more stats. Again some of them are zero, even if I let the analysis finish. (Screenshot)
Ben Gainey said: What device is this and is it stock or rooted?
Same as the thread starter. Qualcomm Snapdragon 888 Mobile Hardware Development Kit.
I also get the following error after the analysis is done.
Arm Processor PMU counters not configured: Arm Processor PMU event counters have been detected, however the event counters are reading zeroes. Event counters include those counters listed in the counter configuration options dialog under the core name but exclude the cycle counter (Clock:Cycles) as it is controlled by a dedicated counter. It is possible that the PMU configuration bit DBGEN has not been enabled, and counter values subsequently will always read as zero. Alternatively, on it may be that the target kernel has not been configured to enable perf hardware counters. To remedy, please update your firmware or Linux kernel to enable DBGEN and/or CONFIG_HW_PERF_EVENTS. If a device tree is used it is also possible that the PMU entries are missing in the device tree. If this is the case please contact your vendor for a patch.
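One of the two causes named in that message, CONFIG_HW_PERF_EVENTS, can be checked from the kernel configuration. Below is a minimal Python sketch; it assumes the running kernel exposes its configuration at /proc/config.gz, which is only the case when the kernel was built with that feature, and the helper names are invented for illustration. It says nothing about DBGEN, which is a hardware/firmware setting.

```python
import gzip


def hw_perf_events_enabled(config_text):
    """Return True if the config text contains CONFIG_HW_PERF_EVENTS=y."""
    for line in config_text.splitlines():
        if line.strip() == "CONFIG_HW_PERF_EVENTS=y":
            return True
    return False


def read_kernel_config(path="/proc/config.gz"):
    """Read a gzip-compressed kernel config file as text."""
    with gzip.open(path, "rt") as handle:
        return handle.read()


if __name__ == "__main__":
    try:
        enabled = hw_perf_events_enabled(read_kernel_config())
        print("CONFIG_HW_PERF_EVENTS enabled:", enabled)
    except OSError:
        # File absent or unreadable, so nothing can be concluded here.
        print("Kernel config not available at /proc/config.gz")
```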
Thanks, can you open a support case via the link on https://developer.arm.com/All%20Support%20Services and attach a capture with the missing counters so that I can examine the data inside? You can menu-click on a capture in the list of captures in Streamline, and use the Export option from the popup menu to create a zip file of the capture containing just the relevant data.