Debugging a Cortex-M0 Hard Fault

There are many references on debugging a Hard Fault on Cortex-M3 & M4; e.g.

niallcooling's Developing a Generic Hard Fault handler for Armv7-M

also:

http://supp.iar.com/Support/?Note=23721

https://community.freescale.com/thread/306244 - which references http://www.keil.com/appnotes/files/apnt209.pdf

http://www.freertos.org/Debugging-Hard-Faults-On-Cortex-M-Microcontrollers.html

http://support.code-red-tech.com/CodeRedWiki/DebugHardFault

But it is hard to find anything specifically for Cortex-M0 (or M0+).

The Armv6-M Architecture Reference Manual seems to be saying that many of the features that the above references rely upon are not provided in Cortex-M0; eg, there's no CFSR and no HFSR.

I have managed to implement a Hard Fault handler (from suggestions above), and it is called when a Hard Fault occurs - just not sure how much of the information is actually valid/useful once I'm there...

Cheers,

Andy.

  • Hi Andy,

    While the fault status registers (and certain fault types) are not present in Armv6-M, the basic process should remain the same. From the Hard Fault handler:

    1. Test bit 2 of the LR to determine whether the MSP or PSP was being used
    2. Read the Stack Frame pointed to by the Stack Pointer discovered in Step 1
    3. The Program Counter in the Stack Frame (xSP+0x18) will tell you the address of the instruction that caused the fault.
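The three steps above can be sketched in plain C roughly as follows. This is only a sketch under some assumptions: the names `stack_frame_t`, `frame_address` and `stacked_pc` are made up for illustration, and the LR, MSP and PSP values are assumed to have been captured by a small toolchain-specific assembly shim at the very start of the handler (before any compiler-generated prologue changes the SP), which is not shown here.

```c
#include <stdint.h>

/* Registers pushed by the core on exception entry, lowest address first.
 * The stacked PC sits at offset 0x18 and the stacked xPSR at 0x1C. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} stack_frame_t;

/* Step 1: bit 2 of the LR value seen on handler entry (EXC_RETURN)
 * selects the stack holding the frame: 0 = MSP, 1 = PSP. */
static inline uint32_t frame_address(uint32_t exc_return,
                                     uint32_t msp, uint32_t psp)
{
    return (exc_return & (1u << 2)) ? psp : msp;
}

/* Steps 2 and 3: read the frame and return the stacked PC. */
static inline uint32_t stacked_pc(const stack_frame_t *frame)
{
    return frame->pc;
}
```

On the target you would cast the result of `frame_address()` to a `const stack_frame_t *` before reading it, and only after checking that the address really points into RAM.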

    jyiu's The Definitive Guide to the Arm Cortex-M0 has a good chapter detailing Fault Handling on the Cortex-M0.

  • Thanks Drew. My book has an appendix on Troubleshooting too.

    If you are using a Cortex-M0+ processor and the Micro Trace Buffer (MTB) is available, the instruction trace feature lets you view the recent execution history. An application note covering the use of the MTB in Keil MDK-Arm is available on the Keil website: http://www.keil.com/appnotes/docs/apnt_259.asp

    In summary, when debugging HardFaults on Cortex-M0/Cortex-M0+ processors, several pieces of information are very useful:

    • Extract the stacked PC (you already mentioned that)
    • Check the T bit in the stacked xPSR
    • Check the IPSR in the stacked xPSR
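For the last two checks, a minimal sketch of pulling the T bit and the IPSR field out of a stacked xPSR value might look like this. The macro and function names are hypothetical; the bit positions assume the Armv6-M layout (T is bit 24, the exception number is in bits [5:0]).

```c
#include <stdbool.h>
#include <stdint.h>

#define XPSR_T_BIT     (1u << 24)  /* EPSR.T: set when in Thumb state */
#define XPSR_IPSR_MASK 0x3Fu       /* Armv6-M exception number, bits [5:0] */

/* True if the stacked xPSR says the core was in Thumb state. */
static inline bool xpsr_thumb(uint32_t xpsr)
{
    return (xpsr & XPSR_T_BIT) != 0u;
}

/* Exception number that was active: 0 means thread mode. */
static inline uint32_t xpsr_exception(uint32_t xpsr)
{
    return xpsr & XPSR_IPSR_MASK;
}

/* External interrupt number (exception number minus 16),
 * or -1 if the active exception was not an external IRQ. */
static inline int xpsr_irq_number(uint32_t xpsr)
{
    uint32_t exc = xpsr_exception(xpsr);
    return (exc >= 16u) ? (int)(exc - 16u) : -1;
}
```

For example, a stacked xPSR of 0x01000013 decodes as Thumb state with exception number 19, i.e. the handler for IRQ 3 was running when the fault occurred.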

    If the SP is pointing to an invalid memory location, then you won’t be able to extract the stack frame. In such cases, you can:

    • Check whether you have allocated enough stack space. Different tool chains have different ways of reporting the stack usage of the application code. In any case, stack usage analysis is something you should do anyway, even if the program didn’t crash. Don’t forget that exception handlers also need stack space, and for each extra nested ISR (interrupt service routine), you need more stack space for the stack frame as well as for the ISR code.
    • Add a few function calls in various places in your program to check for stack leaks. CMSIS-Core provides some functions to help access the SP value (e.g. __get_MSP()), and you can use those functions to add stack checking code (e.g. the value of MSP should be the same every time a given function is called).
    • If you are not using an RTOS, you can use the banked stack pointer feature to separate the stack used by threads from the stack used by handlers. In this way you can also add stack checking in the ISR with the lowest priority level. Higher priority ISRs cannot use this trick because the SP value can be different if a lower priority ISR was running.
    • If you are using an RTOS, some of them (including Keil RTX) have an optional stack checking feature.
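The stack checking idea in the list above could be sketched along these lines. The `sp_monitor_t` type and both functions are hypothetical helpers, not part of CMSIS-Core, and the stack limits are assumed to come from your linker script or startup code. The SP value is passed in as a parameter so the logic stays independent of the target; on the device you would pass in `__get_MSP()`.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t limit;    /* lowest valid stack address (assumed from linker) */
    uint32_t top;      /* initial SP value (assumed from linker) */
    uint32_t expected; /* SP recorded on the first check */
    bool     armed;    /* true once 'expected' has been recorded */
} sp_monitor_t;

/* True if sp lies inside the region reserved for the stack. */
static inline bool sp_in_bounds(const sp_monitor_t *m, uint32_t sp)
{
    return sp >= m->limit && sp <= m->top;
}

/* Call from the same place each time (e.g. the top of the main loop).
 * Returns false if SP is out of bounds or differs from the first call,
 * which would suggest a stack overflow or a stack leak. */
static inline bool sp_check(sp_monitor_t *m, uint32_t sp)
{
    if (!sp_in_bounds(m, sp)) {
        return false;
    }
    if (!m->armed) {
        m->expected = sp;
        m->armed = true;
        return true;
    }
    return sp == m->expected;
}
```

A typical use would be `if (!sp_check(&mon, __get_MSP())) { /* halt or log */ }` at a fixed point in the main loop, with one monitor per call site.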

    If the SP is pointing to a valid location, then you should be able to extract some useful information from the stack frame.

    • If the T bit in the stacked xPSR is 0, something is trying to switch the processor into Arm state.
    • If the T bit in the stacked xPSR is 0 and the stacked PC is pointing to the beginning of an ISR, check the vector table (the LSB of every exception vector should be set to 1).
    • If the stacked IPSR (inside xPSR) indicates that an ISR was running, and the stacked PC is not inside the address range of the ISR code, then you are likely to have stack corruption in that ISR. Look out for array accesses with an unbounded index.
    • If the stacked PC is pointing to a memory access instruction, usually you can debug the load/store issue based on the register contents (see below):

    Faults related to memory access instructions can be caused by:

    • Invalid address - check the address value
    • Data alignment issue (the processor attempted to carry out an unaligned data access)
    • On a Cortex-M0+ processor, check for memory access permission violations (e.g. unprivileged access to an NVIC register) or MPU permission violations.
    • A bus component or peripheral returned an error response for some other reason.
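The first two causes in this list can be checked mechanically once you have the address from the register contents. A small sketch, assuming the access size is 1, 2 or 4 bytes and that you supply your own table of valid regions from the device's memory map (`mem_region_t` and both functions are hypothetical names):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One valid address range of the device, with inclusive bounds. */
typedef struct {
    uint32_t start;
    uint32_t end;
} mem_region_t;

/* True if addr is naturally aligned for an access of 'size' bytes.
 * 'size' must be a power of two (1, 2 or 4). */
static inline bool access_is_aligned(uint32_t addr, uint32_t size)
{
    return (addr & (size - 1u)) == 0u;
}

/* True if addr falls inside one of the n regions in 'map'. */
static inline bool address_is_mapped(const mem_region_t *map, size_t n,
                                     uint32_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (addr >= map[i].start && addr <= map[i].end) {
            return true;
        }
    }
    return false;
}
```

For instance, a word load (size 4) from 0x20000002 fails the alignment check, which on Cortex-M0/M0+ is enough to cause a HardFault.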

    You can also get a HardFault exception if you execute an SVC instruction in an exception handler with the same or higher priority than the SVC priority level. The fault happens because the current context does not have the right priority level for the SVC.
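The SVC rule can be written down as a simple predicate. This is only a sketch: `svc_would_hardfault` is a hypothetical helper, priorities are given as plain integers where a lower number means a higher priority (with -2 for NMI and -1 for HardFault), and it assumes PRIMASK is clear.

```c
#include <stdbool.h>
#include <stdint.h>

/* ipsr:             active exception number (0 = thread mode)
 * current_priority: priority of the running handler (lower = higher)
 * svc_priority:     configured priority of the SVCall exception
 * Returns true if executing SVC here would escalate to HardFault. */
static inline bool svc_would_hardfault(uint32_t ipsr,
                                       int current_priority,
                                       int svc_priority)
{
    if (ipsr == 0u) {
        return false; /* thread mode: SVC can be taken (PRIMASK assumed 0) */
    }
    /* Same or higher priority than SVCall means SVC cannot preempt. */
    return current_priority <= svc_priority;
}
```

So an SVC executed from an ISR at priority 0x40 with SVCall configured at 0x80 would fault, while the same SVC from an ISR at priority 0xC0 would not.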

    regards,

    Joseph
