Over the past few months I've been doing a lot of work on a Kinetis K24 processor, which is a Cortex-M4, running the MQXLITE RTOS. The application also pulls in a couple of other SDKs and has a surprising level of complexity for a CM4 application. All of that leads to a frustrating number of faults, and I still have trouble catching a few of them.
I currently run with the UsageFault, MemManage, and BusFault handlers disabled because I consider all of these fatal errors. I'm only interested in logging the cause of the fault to persistent storage and rebooting, so I force everything to escalate to HardFault.
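As a rough illustration of what "disabled and escalating" means at the register level, the sketch below uses the ARMv7-M bit positions for `SCB->SHCSR` and `SCB->HFSR`. The helper names are hypothetical and are not part of the code running on the board. They are written as plain functions on register values so they can be checked on a host.

```c
#include <stdbool.h>
#include <stdint.h>

/* ARMv7-M SCB->SHCSR enable bits for the configurable fault handlers. */
#define SHCSR_MEMFAULTENA (1u << 16)
#define SHCSR_BUSFAULTENA (1u << 17)
#define SHCSR_USGFAULTENA (1u << 18)

/* ARMv7-M SCB->HFSR status bits. */
#define HFSR_VECTTBL  (1u << 1)   /* bus fault on a vector table read */
#define HFSR_FORCED   (1u << 30)  /* a configurable fault was escalated */
#define HFSR_DEBUGEVT (1u << 31)  /* debug event while debug is disabled */

/* Hypothetical helper: returns the SHCSR value with all three configurable
 * fault handlers disabled, so those faults escalate to HardFault. */
uint32_t shcsr_disable_configurable_faults(uint32_t shcsr)
{
    return shcsr & ~(SHCSR_MEMFAULTENA | SHCSR_BUSFAULTENA | SHCSR_USGFAULTENA);
}

/* Hypothetical helper: true if this HardFault is an escalated
 * MemManage, BusFault, or UsageFault rather than a "native" hard fault. */
bool hardfault_was_escalated(uint32_t hfsr)
{
    return (hfsr & HFSR_FORCED) != 0;
}
```

On the target, the first helper would be applied to `SCB->SHCSR` at startup and the second to `SCB->HFSR` inside the fault path. When `FORCED` is set, the original cause is still recorded in `SCB->CFSR`.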
I currently have a hard fault handler that looks like this:
    __asm volatile (
        " ldr r1, =last_fault \n"          // get the persistent data address
        " mov r2, #1 \n"                   // store the fault type
        " str r2, [r1, #28] \n"
        " tst lr, #4 \n"                   // Determine which banked stack pointer we were using when the fault occurred
        " ittee eq \n"
        " mrseq r0, msp \n"                // Load the appropriate stack pointer
        " andeq r4, r0, #0x80000000 \n"    // And mark which one it was
        " mrsne r0, psp \n"
        " movne r4, r0 \n"
        " str r4, [r1, #16] \n"            // put away the stack register
        " ldr r3, [r0, #20] \n"            // stored lr
        " ldr r2, [r0, #24] \n"            // stored pc
        " ldr r5, [r0, #0] \n"             // stored r0
        " ldr r6, [r0, #4] \n"             // stored r1
        " str r3, [r1, #12] \n"            // put away the lr
        " str r2, [r1, #8] \n"             // put away the pc
        " str r5, [r1, #20] \n"            // put away cached r0
        " str r6, [r1, #24] \n"            // put away cached r1
        " ldr r2, handler2_address_const \n"   // a handler that parses the fault status registers
        " blx r2 \n"
        " handler2_address_const: .word store_fault_info \n"
        " bkpt 255"                        // force a lockup and reset the chip
    );
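For readers who find the offsets hard to follow, here is a C sketch that mirrors what the assembly above stores. The record layout is inferred from the offsets in the assembly (8 through 28). The first two words are an assumption, presumed to be filled in by `store_fault_info`. The type and function names are hypothetical, and the stack pointer value is passed in separately so the function can be exercised on a host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First eight words pushed by hardware on exception entry. They sit at the
 * same offsets whether or not floating-point state is stacked after them. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} exception_frame_t;

/* Hypothetical layout of last_fault, inferred from the str offsets. */
typedef struct {
    uint32_t status[2];   /* offsets 0 and 4: assumed, not set by the asm */
    uint32_t pc;          /* offset 8  */
    uint32_t lr;          /* offset 12 */
    uint32_t sp;          /* offset 16 */
    uint32_t r0;          /* offset 20 */
    uint32_t r1;          /* offset 24 */
    uint32_t fault_type;  /* offset 28 */
} fault_record_t;

static_assert(offsetof(exception_frame_t, lr) == 20, "stacked lr offset");
static_assert(offsetof(exception_frame_t, pc) == 24, "stacked pc offset");
static_assert(offsetof(fault_record_t, pc) == 8, "record pc offset");
static_assert(offsetof(fault_record_t, fault_type) == 28, "record type offset");

/* Bit 2 of EXC_RETURN (the lr value on handler entry): set means the frame
 * was pushed on the process stack (PSP), clear means the main stack (MSP). */
#define EXC_RETURN_SPSEL (1u << 2)

void capture_frame(fault_record_t *rec, uint32_t exc_return,
                   const exception_frame_t *frame, uint32_t sp_value)
{
    rec->fault_type = 1u;
    /* Same as the ittee block: the PSP value is stored as-is, while the
     * MSP value is ANDed with 0x80000000 before it is stored. */
    rec->sp = (exc_return & EXC_RETURN_SPSEL) ? sp_value
                                              : (sp_value & 0x80000000u);
    rec->lr = frame->lr;
    rec->pc = frame->pc;
    rec->r0 = frame->r0;
    rec->r1 = frame->r1;
}
```

Note that for an MSP in normal Kinetis SRAM the AND yields 0, so the stored value only tells you which stack was active, not where it pointed.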
This has served me well for a lot of simple faults - null pointer dereferences, etc. The handler reads the fault status information and writes it to a peripheral on the K24 called the "system register file", which persists through any reboot that isn't a POR or low-voltage reset. I then read it back when I boot up.
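The body of `store_fault_info` is not shown here. As a minimal sketch of the kind of decoding such a handler does, the code below splits an `SCB->CFSR` value into its three architectural sub-registers and checks the address-valid flags. The bit positions are the ARMv7-M ones. The type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* The three sub-registers packed into the 32-bit CFSR. */
typedef struct {
    uint8_t  mmfsr;  /* CFSR[7:0]   MemManage fault status */
    uint8_t  bfsr;   /* CFSR[15:8]  BusFault status */
    uint16_t ufsr;   /* CFSR[31:16] UsageFault status */
} cfsr_fields_t;

#define MMFSR_MMARVALID (1u << 7)  /* MMFAR holds the faulting address */
#define BFSR_PRECISERR  (1u << 1)  /* precise data bus error */
#define BFSR_BFARVALID  (1u << 7)  /* BFAR holds the faulting address */

/* Hypothetical helper: split a raw CFSR value into its three fields. */
cfsr_fields_t split_cfsr(uint32_t cfsr)
{
    cfsr_fields_t f;
    f.mmfsr = (uint8_t)(cfsr & 0xFFu);
    f.bfsr  = (uint8_t)((cfsr >> 8) & 0xFFu);
    f.ufsr  = (uint16_t)(cfsr >> 16);
    return f;
}

/* True if BFAR can be trusted as the address of the faulting access. */
bool bfar_is_valid(uint32_t cfsr)
{
    return (split_cfsr(cfsr).bfsr & BFSR_BFARVALID) != 0;
}

/* True if MMFAR can be trusted as the address of the faulting access. */
bool mmfar_is_valid(uint32_t cfsr)
{
    return (split_cfsr(cfsr).mmfsr & MMFSR_MMARVALID) != 0;
}
```

On the target, the raw value would come from `SCB->CFSR`, and `SCB->BFAR` or `SCB->MMFAR` would only be logged when the matching valid flag is set.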
However, I still get some faults that do not appear to trigger this handler - I get a reboot, and my persistent data is uninitialized. My core question is, why does my handler sometimes not execute when a hard fault occurs? And how can I make it more general to handle this case?