Unable to determine offending instruction: UsageFault, illegal unaligned load or store (Cortex-M7, Keil MDK Pro)

Problem

Once in a blue moon (roughly every 500 hours of run time, non-deterministically) I get a UsageFault: illegal unaligned load or store. Please help me trace the fault to the actual offending instruction and extract additional information.

What I have done so far:

Per ARM AN209 (www.keil.com/.../apnt209.pdf) I have installed a hard fault handler.

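extern "C" void HardFault_Handler(void)
{
    __asm volatile
    (
        " tst lr, #4            \n"  /* EXC_RETURN bit 2 selects the stack that holds the frame */
        " ite eq                \n"
        " mrseq r0, msp         \n"  /* bit clear: frame is on the main stack */
        " mrsne r0, psp         \n"  /* bit set: frame is on the process stack */
        " ldr r1, [r0, #24]     \n"  /* stacked PC sits at offset 24 in the frame */
        " ldr r2, handler2_address_const \n"
        " bx r2                 \n"  /* jump to the C handler with r0 = frame pointer */
        " handler2_address_const: .word prvGetRegistersFromStack...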

Values of the registers extracted from the exception stack frame are:

LR is 0x0803B131
PC is 0x0803A758
PSR is 0x01000000
// Extracted after printf executed
HFSR is 0x40000000 indicating Forced (I do not have a separate handler for UsageFaults)
CFSR is 0x01000000 indicating UsageFault, UNALIGNED access
xPSR is 0x01000003
MMFAR is 0
BFAR is 0

The stacked PC (0x0803A758) points into the endless loop of osRtxIdleThread.

This could not be the offending instruction, could it? What am I missing?

I have checked the RM0385 reference manual, and as far as I can see the exception stack frame has been parsed correctly.

@2001A198:

00000000 00000000 00000000 00000000   00000000 0803B131 0803A758 01000000
E25A2EA5

My system is:

Cortex M7 (STM32F746NG..)

Keil RTOS

Once the unit stalled, I connected to it with a SEGGER J-Link/J-Trace for Cortex-M (using Ozone), halted the program, and examined the memory contents.

The unaligned access trap (UNALIGN_TRP) is disabled (per the reference, this means that only multi-word instructions can generate this fault).

I read:

community.arm.com/.../debugging-a-usage-fault-for-an-unaligned-memory-access

www.keil.com/.../3777.htm (but no external SDRAM is used)

RM0385 Reference manual

also I read:

medium.com/.../the-curious-case-of-unaligned-access-on-arm-5dd0ebe24965

stackoverflow.com/.../unaligned-access-causes-error-on-arm-cortex-m4

stackoverflow.com/.../what-is-non-aligned-access-arm-keil

stackoverflow.com/.../arm-unaligned-memory-access-workaround

  • Without direct access to your platform it is very hard to guess what is happening. One possible cause is a stack overflow, but I am not sure why the stacked PC would then point to the idle thread. Since you mentioned you have a J-Trace, ideally use it to collect an instruction trace in real time; that can be really useful in solving problems like this.

    A few other things you can try:

    - Investigate whether there is a stack overflow in your RTX threads: in your RTX_Config.h, enable Stack overrun checking, and optionally enable Stack usage watermark.

    If you enable Stack usage watermark, you can use the RTX RTOS viewer (View->Watch windows -> RTX RTOS) to observe the actual stack usage. Once the program has run for a while and has been halted (by you, via the debugger), you can expand the thread information in the RTX RTOS window to see the stack usage details. (Note: enabling these checks increases context switching overhead, so normally they are enabled only during software development.)

    - Investigate whether the main stack overflows. First look at the stack usage report in the html file (in the objects directory) to see the maximum stack usage, and compare it with the main stack size allocated in the device startup file.

    - Use event trace to observe which combination of exception events happens just before the crash.

    - You could also try setting a data watchpoint at the end of the main stack (add a data variable to the main stack declaration and set the data watchpoint on it) to see whether it is hit. If it is, your main stack has overflowed.

    regards,

    Joseph
