In my Cortex-M7 based project, a reset often takes place for no apparent reason. The only explanation I can think of is a lockup. If I put a breakpoint on the first instruction of the reset handler, is it possible to find out the cause of the lockup? In other words, how can I debug the lockup after the reset has taken place?
Best regards
Max
Hi Max,
Lockup does not automatically cause a reset. Usually, if a fault event takes place, the processor first enters one of the fault exception handlers, such as HardFault. You can place a breakpoint there, either by setting one with the debugger or by adding a breakpoint instruction. For example, you can use __BKPT() in C, which is an intrinsic defined in the CMSIS-CORE header files: https://www.keil.com/pack/doc/cmsis/Core/html/group__intrinsic__CPU__gr.html
If you do not define a HardFault_Handler, the default HardFault handler (likely to be in the startup code) might enter an infinite loop. If you have a watchdog timer running, it will then trigger a reset.
Another area you should check is whether you have allocated enough stack and heap memory for the application. If not, the stack can get corrupted, which might cause some functions to return to the wrong address and might make it look as if the system has been reset.
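Once execution stops at a breakpoint in the fault handler, the fault status registers indicate what went wrong. The following is a minimal sketch, assuming the raw values of CFSR (address 0xE000ED28) and HFSR (address 0xE000ED2C) have been read with the debugger or copied from SCB->CFSR and SCB->HFSR. The helper names decode_cfsr() and hfsr_is_forced() are made up for illustration and are not part of CMSIS; the bit positions follow the Armv7-M architecture.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One entry per documented CFSR status bit (Armv7-M bit positions). */
typedef struct {
    uint32_t    mask;
    const char *name;
} fault_bit_t;

static const fault_bit_t cfsr_bits[] = {
    /* MemManage fault status (bits 0..7) */
    { 1u << 0,  "IACCVIOL"    }, { 1u << 1,  "DACCVIOL"  },
    { 1u << 3,  "MUNSTKERR"   }, { 1u << 4,  "MSTKERR"   },
    { 1u << 5,  "MLSPERR"     }, { 1u << 7,  "MMARVALID" },
    /* BusFault status (bits 8..15) */
    { 1u << 8,  "IBUSERR"     }, { 1u << 9,  "PRECISERR" },
    { 1u << 10, "IMPRECISERR" }, { 1u << 11, "UNSTKERR"  },
    { 1u << 12, "STKERR"      }, { 1u << 13, "LSPERR"    },
    { 1u << 15, "BFARVALID"   },
    /* UsageFault status (bits 16..31) */
    { 1u << 16, "UNDEFINSTR"  }, { 1u << 17, "INVSTATE"  },
    { 1u << 18, "INVPC"       }, { 1u << 19, "NOCP"      },
    { 1u << 24, "UNALIGNED"   }, { 1u << 25, "DIVBYZERO" },
};

/* Hypothetical helper: prints the name of every set CFSR bit
 * and returns how many documented bits were set. */
int decode_cfsr(uint32_t cfsr)
{
    int count = 0;
    for (size_t i = 0; i < sizeof cfsr_bits / sizeof cfsr_bits[0]; i++) {
        if (cfsr & cfsr_bits[i].mask) {
            printf("CFSR: %s\n", cfsr_bits[i].name);
            count++;
        }
    }
    return count;
}

/* Hypothetical helper: HFSR.FORCED (bit 30) means a configurable fault
 * was escalated to HardFault. */
int hfsr_is_forced(uint32_t hfsr)
{
    return (int)((hfsr >> 30) & 1u);
}
```

For example, decode_cfsr(0x00000400) reports IMPRECISERR, an imprecise bus fault, for which the stacked PC usually does not point at the instruction that caused the fault.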
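One common way to check the stack margin is to fill the unused part of the stack with a known pattern early in startup and later count how much of the pattern is still intact. The following is only a sketch of that idea: the names stack_paint() and stack_unused_words() and the pattern value are made up for illustration, and on a real target the base pointer and word count would have to come from the linker script symbols of your project, covering only the region below the current stack pointer.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_PAINT 0xDEADBEEFu  /* arbitrary fill pattern (assumption) */

/* Fill a stack region with the pattern. 'base' is the lowest address. */
void stack_paint(uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_PAINT;
    }
}

/* The Cortex-M stack grows downwards, so the lowest addresses are the
 * last to be used. Count the untouched words starting from the bottom. */
size_t stack_unused_words(const uint32_t *base, size_t words)
{
    size_t n = 0;
    while (n < words && base[n] == STACK_PAINT) {
        n++;
    }
    return n;
}
```

If stack_unused_words() returns 0, or a very small number, when inspected at a breakpoint, the stack has probably been exhausted at some point.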
There are some pages that might help regarding debugging:
https://www.keil.com/appnotes/docs/apnt_209.asp
https://community.arm.com/iot/embedded/f/discussions/3257/debugging-a-cortex-m0-hard-fault
regards,
Joseph
In my code I have enabled the faults and added the code to analyze them, as explained here:
https://www.freertos.org/Debugging-Hard-Faults-On-Cortex-M-Microcontrollers.html
I have the faults enabled:
SCB->SHCSR |= SCB_SHCSR_USGFAULTENA_Msk | SCB_SHCSR_BUSFAULTENA_Msk | SCB_SHCSR_MEMFAULTENA_Msk; // enable UsageFault, BusFault, and MemManage fault
I have no watchdog active, yet the core is reset. My SoC (NXP i.MX RT1051) has a register that tells me that the reset was caused by a lockup. It happens quite early, after the startup code has populated the .data section and zeroed .bss in RAM. Execution then jumps to main(), and the first function it calls (which sets up the pins and their multiplexing) never returns. The operations inside this function are mostly repetitive, and it fails about halfway through. So I would like to understand what kind of event caused the lockup without (apparently) raising any fault.
Is there a way to debug this?
best regards
Which toolchain are you using? If you have a debug probe and toolchain that support ETM instruction trace, that might be the easiest way to see how far the program had got when the reset happened.
I use gcc-arm-none-eabi-8-2018-q4-major-linux.
The i.MX RT1051 has an ETM, but unfortunately, due to hardware constraints, only the SWD signals are routed to a 10-pin connector. I don't know whether the ETM also includes an ETB, or whether the two are mutually exclusive peripherals.
Potentially a chip can have ETM+ETB. In that case the trace data can be routed to the ETB, and you will then be able to capture the instruction trace if the debug tool supports it.
> Then it jumps to the main(), and the first function it performs (to set pins and their multiplexing) does not exit.
Do you mean the system crashes right after entering main()? (The description is not very clear.)
None of the documents related to RT1051 refers to ETB, but only to ETM.
Joseph Yiu said: Do you mean the system crashes right after entering main()? (The description is not very clear.)
I'll try to explain better: the RT1051 SoC has a ROM bootloader (whose code is written by NXP and is obviously closed). At reset, the reset handler in the ROM bootloader is executed. Then, after the ROM bootloader has done all its work, execution jumps into my startup code, which copies .data to RAM and zeroes .bss. The watchdogs are turned off and the caches enabled, and then execution jumps to main(). main() enables the faults and calls the function that initializes the multiplexing of the GPIO pins and ports. This function never returns. However, when I single-step through it at the assembly level, everything works, so the only way I have to debug is to place breakpoints and bisect the code.
So the crash happens many instructions after entering main().
Best regards
Thanks.
How do you initialize the caches? There was a bug in core_cm7.h: if the D-cache had already been initialized and the D-cache initialization function got called again, it could cause a crash. That was fixed in the CMSIS 5 GitHub repository in Oct 2018.
That gives me hope. My CMSIS sources are given to me by NXP within their SDK. The latest version of the SDK incorporates CMSIS v5.0.5 from January 2018.
The code to enable the caches is:
/* Enable instruction and data caches */
#if defined(__ICACHE_PRESENT) && __ICACHE_PRESENT
    if (SCB_CCR_IC_Msk != (SCB_CCR_IC_Msk & SCB->CCR)) {
        SCB_EnableICache();
    }
#endif
#if defined(__DCACHE_PRESENT) && __DCACHE_PRESENT
    if (SCB_CCR_DC_Msk != (SCB_CCR_DC_Msk & SCB->CCR)) {
        SCB_EnableDCache();
    }
#endif
These are the only lines where the cache is mentioned before the crash.
I tried commenting out these lines, so that if the ROM bootloader enables the caches, nothing else re-enables them; otherwise, if the ROM bootloader doesn't enable them, the code runs without caches. That is slower, but I hoped it would avoid the bug.
Unfortunately the crash still happens.
>Unfortunately the crash still happens.
Hmm... have you tried increasing the size of the stack to see whether there is a stack overflow problem?
Otherwise, breakpoints and bisection are possibly the easiest solution, given that you don't have instruction trace.
(In case you want to get hold of the latest CMSIS release, you can find it at
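As a complement to bisecting with breakpoints, a "breadcrumb" variable can record the last checkpoint reached, so that it can be inspected at a breakpoint on the reset handler after the lockup reset. This is only a sketch under several assumptions: the names crumb_set(), crumb_last() and crumb_clear() are made up for illustration; on the target the variable would have to live in a RAM section that the startup code does not initialize (the section name is toolchain specific); the RAM contents must survive the reset and not be overwritten by the ROM bootloader; and if the variable is in cacheable memory, the last write may not have reached RAM before the reset.

```c
#include <stdint.h>

/* Upper 24 bits mark the breadcrumb as valid; the low 8 bits hold the id. */
#define CRUMB_MAGIC 0xC0FFEE00u

/* Assumption: on the target, place this in a non-initialized RAM section
 * so the startup code does not clear it. */
static volatile uint32_t crumb;

/* Record that checkpoint 'id' has been reached. */
void crumb_set(uint8_t id)
{
    crumb = CRUMB_MAGIC | id;
}

/* Return the last recorded checkpoint id, or -1 if no valid breadcrumb
 * is present (for example after a power-on reset). */
int crumb_last(void)
{
    uint32_t v = crumb;
    if ((v & 0xFFFFFF00u) != CRUMB_MAGIC) {
        return -1;
    }
    return (int)(v & 0xFFu);
}

/* Invalidate the breadcrumb once it has been read. */
void crumb_clear(void)
{
    crumb = 0u;
}
```

Calling crumb_set(1), crumb_set(2), ... at intervals inside the pin initialization function would narrow down the failing region without changing the timing as much as a breakpoint does.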
https://github.com/ARM-software/CMSIS_5/releases )