The manual says that the ETM can be used for both self-hosted and external debugging. With DSTREAM and DS-5, the ETM works well. However, I find that I cannot modify the trace registers in DS-5; the command always fails with "verify error on memory operation".
Moreover, can I use the ETM without JTAG and DSTREAM? I want to enable tracing and read the ETB from EL3, which I guess should be feasible since self-hosted debugging is supported. However, after I unplug the JTAG, I cannot read the CSS_DEVICE memory region in Trusted Firmware. I checked the manual: all the debug-related components (CTI, ETM, trace) are mapped to this region (0x20000000-0x2e000000), but a read from this region in EL3 hangs. I am really confused about it. I have checked the MMU page tables and even tried disabling the MMU, but it still hangs.
I would really appreciate any help!
-------------------------------------------------
Edit 09/02/2016: It is not only the ETM. If I try to read from the CoreSight debug and trace memory region (0x20000000-0x23350000) in EL3 without JTAG plugged in, the system just hangs. I don't think it is caused by memory mapping, as I got the same result even with the MMU temporarily disabled. Did I miss something special?
The SCP firmware is definitely old. Set Device Power State was supported, but lots of issues have been
fixed since last December. Since I am not completely aware of the Linaro releases, I would
suggest (just for the sake of getting the firmware and other setup right) using the latest mainline
(yet to be tagged v4.9-rc1, as there are a few CoreSight driver bug fixes) with the latest 16.09
firmware release.
Regards,
Sudeep
Hi Matt,
Thanks so much for your help! I manually checked the source code of the 15.09, 16.06 and 16.09 Linaro releases; however, none of them contains the patches you mentioned.
But you reminded me that I made a mistake earlier in this discussion. I mentioned that I had checked the status of the cluster and CPU power domains. However, I now know that was the wrong thing to check, as they belong to the "CSS Power State" and not the "Device Power State". So I sent the "Set Device Power State" command to the SCP as you suggested, and the memory region is finally accessible.
I really appreciate your help over such a long time!
Best Regards,
Zhenyu
Zhenyu,
Regardless of the SCP firmware version, I couldn't find a Juno reference release on Linaro that actually has the power domains defined in the device tree. Without these it's obvious that there can be no instruction to the driver that a particular device could be powered on or not. You can find some documentation on the SCPI protocol and the Juno implementation on Infocenter in the usual place - it's alongside the Juno documentation.
What you seem to require is some ability for some driver to either directly or indirectly send a "Set Device Power State" request for domain 0 (DEBUGSYS) - once this takes effect, you should be able to use the ETM.
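For anyone following along, here is a minimal sketch of what the payload of such a request could look like. It assumes the SCPI standard command ID 0x1B for Set Device Power State, a 3-byte payload (16-bit little-endian device ID followed by an 8-bit power state), and the state encodings 0 = on and 3 = off, which is how I understand the Linux SCPI driver to encode it. The macro and function names are hypothetical, and actually sending the message through the mailbox is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values: check them against the SCPI specification for your firmware. */
#define SCPI_CMD_SET_DEVICE_PWR_STATE 0x1Bu
#define SCPI_DEVICE_DEBUGSYS          0u  /* domain 0 (DEBUGSYS), as stated in this thread */
#define SCPI_PD_STATE_ON              0u
#define SCPI_PD_STATE_OFF             3u

/*
 * Pack the payload of a "Set Device Power State" request into buf.
 * Returns the number of bytes written, or 0 if buf is NULL or too small.
 * (Hypothetical helper, not part of any real firmware API.)
 */
static size_t scpi_pack_set_device_pwr_state(uint8_t *buf, size_t len,
                                             uint16_t dev_id, uint8_t pstate)
{
    if (buf == NULL || len < 3)
        return 0;

    buf[0] = (uint8_t)(dev_id & 0xFFu); /* device ID, low byte */
    buf[1] = (uint8_t)(dev_id >> 8);    /* device ID, high byte */
    buf[2] = pstate;                    /* requested power state */
    return 3;
}
```

Calling it with SCPI_DEVICE_DEBUGSYS and SCPI_PD_STATE_ON yields the three bytes 00 00 00, which would then be sent together with the command ID by whatever mailbox transport the platform uses.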
We have already mentioned in this thread the patches that add that power domain information to the DT. The patches that implement the power domain support and wire it to the Linux 'runtime PM' infrastructure are easy to find from that point:
arm64: Kconfig: select PM{,_GENERIC_DOMAINS} for ARCH_VEXPRESS
firmware: arm_scpi: add support for device power state management
Documentation: add DT bindings for ARM SCPI power domains
firmware: scpi: add device power domain support using genpd
arm64: dts: juno: add coresight support
arm64: dts: juno: add SCPI power domains for device power management
If your kernel doesn't include one or more of these patches (or any dependencies) then you'll have to do some porting. I'd suggest you give a Linaro mailing list a nudge, since I would expect this should be in the Android tree by now.
After that point, I'm afraid I don't have any specific advice on the matter.
Ta,
Matt
Hi Sudeep,
Sorry for the late response. As the 16.06 release did not solve my issue, I did not port my work to it, so I am now
again working on the 15.09 release. While Linux is booting, it prints something like:
scpi_protocol scpi: SCP Protocol 1.0 Firmware 1.9.0 version
Is this the version you need?
Regarding "ioremap", I only used it for a simple experiment. In previous experiments, I actually tried to access the ETM directly from EL3 through the memory-mapped interface after disabling the MMU.
Thanks for your help!
Hi Zhenyu,
If you are seeing the whole system hang, then it has something to do with the SCP firmware.
Can you look at the SCPI firmware version that gets printed out while Linux boots?
Instead of trying to ioremap manually yourself, just enable the self-hosted
ETM support in the kernel.
I am not sure about the 16.06 release; there are a few patches/fixes queued for the v4.9 kernel, so
I would suggest giving that a try. It fixes a lot of crashes we have seen so far with ETM.
Hope this helps.
I have tried the 16.06 release; however, the situation has not changed.
As a simple experiment, I used the following steps:
1. Boot Android
2. Write a minimal kernel module that uses the "ioremap" function to remap the memory region, then load the module dynamically.
Then the system hangs again. Normally an exception in a kernel module only kills the module, so it is weird that the whole system hangs.
Moreover, I tried using ioremap to map a large region like,
ioremap(0x22040000, 0x1000);
or just map a small region like,
ioremap(0x22040314, 0x8);
but the result is similar.
As the core and cluster power domains are all powered up, is there anything else we may be missing?
Thanks so much for your help!
Thanks so much for your reply. I will try to port our work to the latest version of the firmware.
I wouldn't doubt it at this point. The "doesn't work when I disconnect JTAG" issue is known; we used to hand out a CSAT script which would connect the DSTREAM and power up the DAP (without maintaining a session), which fixed a lot of things but still required the JTAG. Later versions of the SCP firmware (BL2) don't seem to require this at all, and we're certain that there have been Linux patches recently that improve specification of the SCPI power domains for self-hosted ETM usage.
If you have problems after that, then we're happy to help.
The firmware version installed on the board is v1.3.3 (printed on the console when the board reboots). The Android image and Linaro source code we use were released in September 2015 (Linaro Releases). They are a little old, as we started working on Juno last year. Would a firmware upgrade help with our issue?
Can you tell us which, for example, Linaro release you're running that got you the Android filesystem and firmware currently installed on the board?
Ta
Thanks for the reply. I have tried reading TRCPDSR (offset 0x314) and TRCPDCR (offset 0x310) before, and the system just hangs, as it does when accessing other registers.
My code looks like,
INFO("value of register TRCPDSR: 0x%x \n", *(uint32_t*)(0x22040000 + 0x314));
As I am working on Cortex-A57 core 0, the base address is 0x22040000. I did not do anything related to debug or trace before this line. When the processor executes this line, it just hangs and no following instructions are executed. I guess that some error occurs while accessing the address, but I cannot read more information from ESR_EL3 as the JTAG is unplugged.
Moreover, it is not only the memory region of the trace registers: I tried many different addresses in the CoreSight debug and trace region (0x20000000-0x23350000), and none of the accesses succeeded. It looks just as if this region hasn't been
mapped to memory. I also tried disabling the MMU and accessing the physical address directly, but that did not help either. However, if I plug in JTAG and connect to the board with DS-5, the code works perfectly and shows me the register values. So I am really confused.
The ETM Architecture Specification really goes into some good detail on which parts of the ETM (and which registers..) are in which power domains. The important ones, though, are TRCPDSR and TRCPDCR which allow you to check status and control the ETM power domain from the ETM (in order to keep it up). Some of how they react depends on implementation, but you really should set TRCPDCR.PU before trying to access any trace registers that aren't in the Debug domain (i.e. some Management, and all Trace registers).
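A minimal sketch of the "set TRCPDCR.PU first" step, under a few assumptions: the offsets and bit positions are as I understand the ETMv4 architecture (TRCPDCR at 0x310 with PU in bit 3, TRCPDSR at 0x314 with POWER in bit 0), the helper names are hypothetical, and memory barriers and timeouts in real time units are omitted. Note that it can only work once the memory-mapped interface itself is accessible.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offsets and bits as understood from the ETMv4 architecture: verify before use. */
#define TRCPDCR_OFFSET 0x310u
#define TRCPDSR_OFFSET 0x314u
#define TRCPDCR_PU     (1u << 3) /* power-up request */
#define TRCPDSR_POWER  (1u << 0) /* trace unit core domain is powered */

/* Return a pointer to the 32-bit register at byte offset off from base. */
static inline volatile uint32_t *etm_reg(volatile void *base, uint32_t off)
{
    return (volatile uint32_t *)((volatile uint8_t *)base + off);
}

/*
 * Request power-up of the trace unit and poll TRCPDSR.POWER at most
 * max_polls times. Returns true if the domain reports powered.
 * (Hypothetical helper for illustration only.)
 */
static bool etm_request_power_up(volatile void *etm_base, unsigned max_polls)
{
    volatile uint32_t *pdcr = etm_reg(etm_base, TRCPDCR_OFFSET);
    volatile uint32_t *pdsr = etm_reg(etm_base, TRCPDSR_OFFSET);

    *pdcr |= TRCPDCR_PU;

    while (max_polls-- > 0) {
        if (*pdsr & TRCPDSR_POWER)
            return true;
    }
    return false;
}
```

Taking the base address as a parameter keeps the sketch independent of any particular platform mapping, so the same function can be exercised against an ordinary array before it is pointed at real hardware.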
Do you have a flow of which registers you're accessing, and where you get "stuck"? It would be easier to point out what you're doing wrong than to list all the combinations of possibly accessing the ETM correctly and what behaviours might entail from it (it is an entire chapter in the ETM Architecture).
Thanks for your reply again. I tried to learn more about the power domains after your last reply. The ETMv4 manual gives two examples, covering the trace unit core power domain and the PE core power domain, and in both examples the memory-mapped programming interface is located in the debug power domain. I then checked the Trusted Firmware code to find out how I can power on the debug power domain, and found that it provides interfaces to configure the core power state and cluster power state. As you mentioned, the debug logic is in the "big" cluster, so I guessed that might be the so-called "debug power domain" on Juno. However, when I read the power state from each processor, the power states of all 6 processors and their clusters are 0, which is defined as ARM_LOCAL_STATE_RUN. Does that mean all the domains are powered up? My test steps are the following:
1. Boot Android.
2. Launch an Android app.
3. Use an SMC instruction to enter EL3.
4. Use the psci_get_target_local_pwr_states function to get the core and cluster power states of the current processor.
5. Trigger a secure SGI to all other processors and use the same function as in step 4 to read their power states.
Moreover, you mentioned that
There are controls in the core and ETM logic which signal the power controller (SCP) to prevent or emulate power down of components,
Is the control a register or something else? I found that the DBGPRCR_EL1 register can emulate power-down of the core power domain, and DS-5 modifies this register after connecting to the board. However, after I manually configured the register, my accesses to the region still fail. Did I miss any other controls?
Thanks again for your help!
Now you're at the mercy of Linux (Android), which, depending on the version of the kernel and device tree you use, has different definitions of the power domains in use and of whether it can power them on and off. It is certainly possible that disabling cpuidle still powers down something that you desire to be turned on.
One thing the ETM is programmed to do when a debugger attaches to it is to set some ETM power up enables, which signal the rest of the system to prevent (or emulate) powerdown of the domain containing the ETM. It obviously requires the domain to be up when the debugger attaches to do so. It also requires the system to be monitoring and respecting that signal. It's possible that the collusion of your kernel, DT, SCPI driver and SCP firmware are not entirely respectful of that process.
Lack of modification of the registers usually implies that the component is in reset (it'd be interesting to know if the registers all show 0 or the documented reset values), powered down (again, all 0?), or locked (DS-5 debugger does do this, but it also has a 'back door'). Note, I made a small mistake: APB_0<verify=0>:0x2000xxxx should actually have bit 31 set, otherwise it may not be bypassing the CoreSight lock (which is not the same as the OS lock!). You should be able to look at the Juno RVC or RCF file within DS-5, or look at the Juno TRM for a description of the "ROM Tables", which will define the APB addresses; these are not the same view as the system addresses.
For your software, as long as the ETM is powered and not in reset, then the CoreSight lock and OS lock are your most likely candidates for any prevention of modification of registers.
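A minimal sketch of clearing both locks from software, assuming the usual CoreSight layout as I understand it: the software lock is released by writing the key 0xC5ACCE55 to the Lock Access Register at offset 0xFB0, and the OS lock by writing 0 to TRCOSLAR at offset 0x300. The function name is hypothetical, and the barriers a real driver would issue after each write are omitted.

```c
#include <stdint.h>

/* Offsets and key as understood from the CoreSight/ETMv4 architecture: verify before use. */
#define TRCOSLAR_OFFSET 0x300u      /* OS Lock Access Register */
#define TRCLAR_OFFSET   0xFB0u      /* CoreSight software Lock Access Register */
#define CS_UNLOCK_KEY   0xC5ACCE55u /* key that releases the software lock */

/*
 * Release the CoreSight software lock, then the OS lock, on the trace
 * unit mapped at etm_base. (Hypothetical helper for illustration only.)
 */
static void etm_unlock(volatile void *etm_base)
{
    volatile uint8_t *base = (volatile uint8_t *)etm_base;

    /* Software lock: any value other than the key locks the component again. */
    *(volatile uint32_t *)(base + TRCLAR_OFFSET) = CS_UNLOCK_KEY;

    /* OS lock: writing 0 clears it so trace registers become writable. */
    *(volatile uint32_t *)(base + TRCOSLAR_OFFSET) = 0;
}
```

If writes still appear to have no effect afterwards, the lock status registers (TRCLSR and TRCOSLSR) are the natural next thing to read back.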
Thanks so much for your reply. In fact, I am not sure whether it is caused by a power-down. I am using Trusted Firmware and Linaro's Android release on Juno. After Android boots, I disabled the idle states of each CPU by writing 1 to /sys/devices/system/cpu/cpu<m>/cpuidle/state<n>/disable. If I connect DS-5 and check the status of the CPUs, all of them show "running". Then I enter EL3 with an SMC instruction and try to access the memory region of the ETM registers from EL3, and it hangs there if JTAG is not connected. So Trusted Firmware should already have run and both clusters should be powered at that moment.
As you mentioned, the "EL3<verify=0>" and "APB<verify=0>" prefixes work well for accessing the registers, and no error occurs any more when I modify them. However, the modification does not really succeed: after I modify the registers, their values do not seem to change. I guess the debugger may restore the values right after my modification.
Also, thanks for the reminder about the locks. I am a beginner with the ETM, and I will try to get more information from the manual.
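The cpuidle step above can be sketched in C as follows. The sysfs path is the one quoted in the post; the function names are hypothetical, and the program has to run as root on the target for the write to succeed.

```c
#include <stdio.h>

/*
 * Build the sysfs path of the "disable" file for one idle state.
 * Returns 0 on success, -1 if the path does not fit in buf.
 */
static int cpuidle_disable_path(char *buf, size_t len,
                                unsigned cpu, unsigned state)
{
    int n = snprintf(buf, len,
                     "/sys/devices/system/cpu/cpu%u/cpuidle/state%u/disable",
                     cpu, state);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Write "1" to the disable file of the given idle state. Returns 0 on success. */
static int cpuidle_disable_state(unsigned cpu, unsigned state)
{
    char path[128];
    FILE *f;
    int ok;

    if (cpuidle_disable_path(path, sizeof path, cpu, state) != 0)
        return -1;

    f = fopen(path, "w");
    if (f == NULL)
        return -1;

    ok = (fputs("1", f) >= 0);
    if (fclose(f) != 0) /* the write may only fail when flushed */
        ok = 0;
    return ok ? 0 : -1;
}
```

Looping this over every CPU and every idle state reproduces the manual procedure, though as the replies in this thread point out, disabling cpuidle does not by itself guarantee that the debug power domain stays up.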