Problem
Once in a blue moon (roughly every 500 hours of run time, non-deterministically) I get a UsageFault: illegal unaligned load or store. Please help me trace the fault to the actual offending instruction and extract additional information.
What I have done so far:
Per ARM AN209 (www.keil.com/.../apnt209.pdf) I have installed a hard fault handler.
extern "C" void HardFault_Handler(void)
{
    __asm volatile
    (
        " tst lr, #4 \n"
        " ite eq \n"
        " mrseq r0, msp \n"
        " mrsne r0, psp \n"
        " ldr r1, [r0, #24] \n"
        " ldr r2, handler2_address_const \n"
        " bx r2 \n"
        " handler2_address_const: .word prvGetRegistersFromStack...
Values of the registers extracted from the exception stack frame are:
LR is 0x0803B131
PC is 0x0803A758
PSR is 0x01000000
// Extracted after printf executed
HFSR is 0x40000000, indicating Forced (I do not have a separate handler for UsageFaults)
CFSR is 0x01000000, indicating UsageFault, UNALIGNED access
xPSR is 0100 0003
MMFAR is 0
BFAR is 0
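For reference, the HFSR and CFSR values quoted above can be decoded mechanically. The following is a minimal host-side sketch, assuming the ARMv7-M bit layout (CFSR holds MMFSR in bits 7:0, BFSR in bits 15:8 and UFSR in bits 31:16). The helper names are hypothetical and do not come from CMSIS or from AN209.

```c
#include <stdint.h>
#include <stdio.h>

/* Bit masks per the ARMv7-M fault status register layout. */
#define HFSR_VECTTBL    ((uint32_t)1u << 1)   /* fault on vector table read   */
#define HFSR_FORCED     ((uint32_t)1u << 30)  /* escalated configurable fault */
#define UFSR_UNDEFINSTR ((uint32_t)1u << 0)
#define UFSR_INVSTATE   ((uint32_t)1u << 1)
#define UFSR_INVPC      ((uint32_t)1u << 2)
#define UFSR_NOCP       ((uint32_t)1u << 3)
#define UFSR_UNALIGNED  ((uint32_t)1u << 8)
#define UFSR_DIVBYZERO  ((uint32_t)1u << 9)

/* Hypothetical helpers: split CFSR into its three sub-registers. */
uint32_t cfsr_mmfsr(uint32_t cfsr) { return cfsr & 0xFFu; }
uint32_t cfsr_bfsr(uint32_t cfsr)  { return (cfsr >> 8) & 0xFFu; }
uint32_t cfsr_ufsr(uint32_t cfsr)  { return (cfsr >> 16) & 0xFFFFu; }

int hfsr_is_forced(uint32_t hfsr)  { return (hfsr & HFSR_FORCED) != 0; }

/* Print a short human-readable summary of the two status registers. */
void print_fault_status(uint32_t hfsr, uint32_t cfsr)
{
    uint32_t ufsr = cfsr_ufsr(cfsr);

    printf("HFSR: FORCED=%d VECTTBL=%d\n",
           hfsr_is_forced(hfsr), (hfsr & HFSR_VECTTBL) != 0);
    printf("MMFSR=0x%02X BFSR=0x%02X UFSR=0x%04X\n",
           (unsigned)cfsr_mmfsr(cfsr), (unsigned)cfsr_bfsr(cfsr),
           (unsigned)ufsr);

    if (ufsr & UFSR_UNDEFINSTR) puts("UFSR: UNDEFINSTR");
    if (ufsr & UFSR_INVSTATE)   puts("UFSR: INVSTATE");
    if (ufsr & UFSR_INVPC)      puts("UFSR: INVPC");
    if (ufsr & UFSR_NOCP)       puts("UFSR: NOCP");
    if (ufsr & UFSR_UNALIGNED)  puts("UFSR: UNALIGNED");
    if (ufsr & UFSR_DIVBYZERO)  puts("UFSR: DIVBYZERO");
}
```

Calling print_fault_status(0x40000000, 0x01000000) with the values from this case reports FORCED together with UNALIGNED, and empty MMFSR and BFSR fields, which is consistent with MMFAR and BFAR both reading 0.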
The stacked PC (0x0803A758) points into the endless loop of osRtxIdleThread.
This could not be the offending instruction, could it? What am I missing?
I have checked the RM0385 reference manual, and as far as I can see the exception stack frame has been parsed correctly.
@2001A198:
00000000 00000000 00000000 00000000 00000000 0803B131 0803A758 01000000 E25A2EA5
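To make the mapping from the dump to the registers explicit, here is a small sketch that decodes the first eight words as a basic exception frame (R0, R1, R2, R3, R12, LR, PC, xPSR, in ascending address order). It assumes a basic frame without floating-point context; on a Cortex-M7 with an active FPU context the frame would be longer. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Basic (non-FP) exception stack frame, lowest address first. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} basic_frame_t;

/* Copy eight consecutive stack words into named fields. */
basic_frame_t decode_basic_frame(const uint32_t words[8])
{
    basic_frame_t f;
    assert(words != NULL);
    f.r0   = words[0];
    f.r1   = words[1];
    f.r2   = words[2];
    f.r3   = words[3];
    f.r12  = words[4];
    f.lr   = words[5];
    f.pc   = words[6];
    f.xpsr = words[7];
    return f;
}

/* xPSR bit 24 is the Thumb state bit (expected to be 1 on Cortex-M). */
int frame_thumb_bit(const basic_frame_t *f)
{
    return (int)((f->xpsr >> 24) & 1u);
}

/* Bits [8:0] of the stacked xPSR hold the exception number that was
 * active when the fault was taken (0 means Thread mode). */
uint32_t frame_exception_number(const basic_frame_t *f)
{
    return f->xpsr & 0x1FFu;
}

/* Stacked xPSR bit 9 is set when the hardware inserted a padding word
 * to keep the stack 8-byte aligned on exception entry. */
int frame_was_padded(const basic_frame_t *f)
{
    return (int)((f->xpsr >> 9) & 1u);
}
```

Fed with the dump at 0x2001A198, this yields LR = 0x0803B131, PC = 0x0803A758 and xPSR = 0x01000000, that is, Thread mode with the Thumb bit set and no alignment padding. The ninth word (E25A2EA5) lies outside a basic frame.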
My system is:
Cortex M7 (STM32F746NG..)
Keil RTOS
Once the unit stalled, I connected to it with a SEGGER J-Link/J-Trace for Cortex-M (using Ozone), halted the program, and examined the memory contents.
The unaligned access trap (UNALIGN_TRP) is disabled (per the reference, this means that only multi-word instructions can generate this fault).
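With UNALIGN_TRP cleared, the instructions that still fault on an unaligned address are the multi-word ones such as LDM, STM, LDRD and STRD. A common way for these to appear is code that casts a byte buffer (for example a UART or network buffer) to a wider pointer type and dereferences it. The sketch below shows an alignment check and memcpy-based accessors that avoid such casts. The function names are hypothetical, and whether this pattern is involved in the present fault is only an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if p is aligned to n bytes (n must be non-zero). */
int is_aligned(const void *p, size_t n)
{
    return ((uintptr_t)p % n) == 0;
}

/* Read a 32-bit value from a possibly unaligned byte pointer.
 * memcpy leaves it to the compiler to pick an access sequence that
 * is valid for an address of unknown alignment. */
uint32_t load_u32_unaligned(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* 64-bit variant; a direct *(const uint64_t *)p could compile to LDRD. */
uint64_t load_u64_unaligned(const uint8_t *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Matching store helper for 64-bit values. */
void store_u64_unaligned(uint8_t *p, uint64_t v)
{
    memcpy(p, &v, sizeof v);
}
```

In a debug build, a check such as is_aligned(ptr, 8) placed in front of suspicious 64-bit or struct accesses can help catch a misaligned pointer long before the rare fault occurs.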
I read:
community.arm.com/.../debugging-a-usage-fault-for-an-unaligned-memory-access
www.keil.com/.../3777.htm (but no external SDRAM is used)
RM0385 Reference manual
I also read:
medium.com/.../the-curious-case-of-unaligned-access-on-arm-5dd0ebe24965
stackoverflow.com/.../unaligned-access-causes-error-on-arm-cortex-m4
stackoverflow.com/.../what-is-non-aligned-access-arm-keil
stackoverflow.com/.../arm-unaligned-memory-access-workaround
Joseph, thanks for your response.
Without direct access to your platform ...
The system is still powered up and can be examined. Let me know what other information you think would be valuable to extract.
FYI, I have interrupt-based UART transfers and TCP/IP networking enabled.
Re: stack overflow
Usually, in the case of a stack overflow, the system goes through SIGABRT and termination rather than into the hard fault handler. Do you think it might be different this time?
Re: use J-Trace to collect instruction trace in real-time
This is what I was planning to do. My trouble is that it sometimes takes 500 hours, sometimes more, to get to a fault. So I am trying to extract as much information from the present case as I can.
And even when I get an instruction trace, in the case of imprecise faults the offending instruction may be many instructions upstream in the execution flow.
Re: in your RTX_Config.h, enable Stack overrun checking and enable Stack usage watermark
Useful, thanks, will do.
Re: use RTX RTOS viewer ... to observe actual stack usage
Will do, thanks.
Re: Use event trace to observe what is the combinations of exception events
Will enable that, great idea.
Joseph, which of the information I provided makes you think it is stack-overflow related?