Hello,
I am trying to understand how the frame pointer works because I want to unwind the stack in a HardFault handler.
I am looking at a disassembly of code that runs perfectly on an Atmel ATSAMV71Q21 (Cortex-M7). It was compiled with GCC in the Atmel Studio 7 IDE. To get a frame pointer, I compiled with -fno-omit-frame-pointer -mtpcs-frame -mtpcs-leaf-frame. It looks like GCC uses register r7 as the frame pointer in Thumb-2 mode.
The function prologue has a push, a sub and an add. I would like to confirm whether the Cortex-M7's superscalar 6-stage pipeline stalls because the instruction at code address 0x00401dce depends on SP, which is modified by the push at 0x00401dcc.
Why doesn't the frame pointer point to something more predictable like the previously pushed r7 frame pointer or the previous SP value before entering the function?
int I2cHW::endTransmission(){
  401dcc: b5b0      push {r4, r5, r7, lr}   /* SAVE REGISTERS. The stack moves down by 4 registers * 4 bytes = 16 bytes. The frame pointer is r7, the link register is LR */
  401dce: b086      sub sp, #24             /* allocate 24 bytes on stack. Lower stack pointer by 24 bytes. */
  401dd0: af04      add r7, sp, #16         /* frame pointer = stack pointer + 16. Why??? */
  .... BODY REMOVED ....
  if (isAckValid){
  401edc: 7a63      ldrb r3, [r4, #9]
  401ede: b11b      cbz r3, 401ee8 <_ZN5I2cHW15endTransmissionEv+0x11c>
    return 0;
  401ee0: 2000      movs r0, #0
  }else{
    return 1;
  }
}
  401ee2: 3708      adds r7, #8             /* move frame pointer by 8 */
  401ee4: 46bd      mov sp, r7              /* stack pointer = frame pointer */
  401ee6: bdb0      pop {r4, r5, r7, pc}    /* restore registers. move stack up by 4 registers * 4 bytes = 16 bytes */
    return 1;
  401ee8: 2001      movs r0, #1
  401eea: e7fa      b.n 401ee2 <_ZN5I2cHW15endTransmissionEv+0x116>
  401eec: 2044f31c  .word 0x2044f31c
  401ef0: 00451e94  .word 0x00451e94
  401ef4: 00451de4  .word 0x00451de4
  401ef8: 00451fac  .word 0x00451fac
  401efc: 00451e5c  .word 0x00451e5c
  401f00: 0044e5ed  .word 0x0044e5ed
  401f04: 00447a71  .word 0x00447a71
  401f08: 00440f71  .word 0x00440f71
  401f0c: 00451f24  .word 0x00451f24
  401f10: 00451fc4  .word 0x00451fc4
  401f14: 00405421  .word 0x00405421
  401f18: 00452004  .word 0x00452004
  401f1c: 00451fec  .word 0x00451fec
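To make the arithmetic in this listing easier to follow, here is a small host-side C sketch that replays the prologue and epilogue on a made-up entry value of SP. It only models the six instructions shown above; the struct, the function name trace_frame and the example address are hypothetical, and it does not explain why GCC chose these offsets.

```c
#include <stdint.h>

/* SP and R7 values at each step of the listing, as plain addresses. */
typedef struct {
    uint32_t sp_after_push;  /* after push {r4, r5, r7, lr}           */
    uint32_t saved_r7_addr;  /* where the caller's r7 was stored      */
    uint32_t saved_lr_addr;  /* where the return address was stored   */
    uint32_t sp_in_body;     /* after sub sp, #24                     */
    uint32_t r7_in_body;     /* after add r7, sp, #16                 */
    uint32_t sp_before_pop;  /* after adds r7, #8 and mov sp, r7      */
    uint32_t sp_on_return;   /* after pop {r4, r5, r7, pc}            */
} frame_trace_t;

/* Replays the prologue and epilogue of this one function. */
frame_trace_t trace_frame(uint32_t sp_on_entry)
{
    frame_trace_t t;

    /* push stores 4 registers of 4 bytes each, lowest register at the
       lowest address: r4, r5, r7, lr. */
    t.sp_after_push = sp_on_entry - 4u * 4u;
    t.saved_r7_addr = t.sp_after_push + 2u * 4u;
    t.saved_lr_addr = t.sp_after_push + 3u * 4u;

    t.sp_in_body    = t.sp_after_push - 24u;  /* sub sp, #24      */
    t.r7_in_body    = t.sp_in_body + 16u;     /* add r7, sp, #16  */

    t.sp_before_pop = t.r7_in_body + 8u;      /* adds r7, #8 ; mov sp, r7 */
    t.sp_on_return  = t.sp_before_pop + 4u * 4u;
    return t;
}
```

With these numbers r7 ends up 8 bytes below the pushed registers, so in this one function the saved r7 sits at r7 + 16 and the saved LR at r7 + 20. Those offsets follow from the sub and add immediates, so other functions may well differ.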
I'll push my code up to Github and post a link here. Might take a day or so...
Cool, thanks!
Hello @tobermory,
I ended up writing a call stack unwind function, mostly in the way you described it here. I am posting my code below. It doesn't look at R7 and doesn't care about stack frames.
P.S. This thread seems to have gone through some editing and deletions. For a while, only a portion of the replies were visible.
#include <stdint.h>
#include <stdio.h>
#include <string.h>

void callStackUnwindIntoBuffer_c(char *callstackunwindbuffer, int callstackunwindbufferLength, uint32_t *pStack);

/* Naked trampoline: copy the caller's SP into R2 (third argument), then branch to the C worker. */
__attribute__((naked)) void callStackUnwindIntoBuffer(char *callstackunwindbuffer, int callstackunwindbufferLength)
{
    asm volatile (
        "MOV R2, SP\n\t"
        "b callStackUnwindIntoBuffer_c\n\t"
    );
}

void callStackUnwindIntoBuffer_c(char *callstackunwindbuffer, int callstackunwindbufferLength, uint32_t *pStack)
{
    extern char _sstack, _estack;               /* stack bounds from the linker script */
    const uint32_t qtyCallStackLevels = 14;
    const uint32_t ignoredLevels = 0;
    volatile uint32_t *locationOfLR = pStack;   /* alternatively: (uint32_t *) __get_MSP() */
    volatile uint32_t *const stackEnd = (volatile uint32_t *) &_estack;
    char *pDest = callstackunwindbuffer;
    int spaceLeft = callstackunwindbufferLength;
    char localBuffer[40];
    int length;

    for (uint32_t i = 0; i < qtyCallStackLevels; i++) {
        /* Linear search for a word that looks like a valid LR address.
           The bounds check comes first so nothing is read past the end of the stack. */
        while ((locationOfLR < stackEnd) &&
               ((((*locationOfLR) & 0xFFFF0000) < 0x00400000) ||
                (((*locationOfLR) & 0xFFFF0000) > 0x004C0000)
                /* || (((*locationOfLR) & 1) == 0) */)) {
            locationOfLR++;
        }
        if (locationOfLR >= stackEnd) {
            break;                              /* reached the end of the stack */
        }
        if (i >= ignoredLevels) {
            snprintf(localBuffer, sizeof localBuffer, "%08lx: 0x%08lx\r\n",
                     (unsigned long) (uintptr_t) locationOfLR, (unsigned long) *locationOfLR);
            length = (int) strlen(localBuffer);
            if (length < spaceLeft) {
                snprintf(pDest, (size_t) spaceLeft, "%s", localBuffer);
                spaceLeft -= length;
                pDest += length;
            }
        }
        locationOfLR++;
    }

    snprintf(localBuffer, sizeof localBuffer, "END\n");
    length = (int) strlen(localBuffer);
    if (length < spaceLeft) {
        snprintf(pDest, (size_t) spaceLeft, "%s", localBuffer);
        spaceLeft -= length;
        pDest += length;
    }
}
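The scan above is hard to try out without the target, so here is a minimal host-testable sketch of the same heuristic, operating on an ordinary array instead of the live stack. The function names and the index-based output are my own assumptions for illustration. It also enables the bit 0 test that is commented out above, since return addresses saved by Thumb code have bit 0 set.

```c
#include <stddef.h>
#include <stdint.h>

/* A word is a candidate return address if it is odd (Thumb bit set)
   and lies inside the code region [code_start, code_end). */
static int looks_like_return_address(uint32_t word,
                                     uint32_t code_start, uint32_t code_end)
{
    return ((word & 1u) != 0u) && (word >= code_start) && (word < code_end);
}

/* Scans `words` stack words and stores the index of each candidate in
   out_index, up to max_out entries. Returns the number of candidates. */
size_t find_return_candidates(const uint32_t *stack, size_t words,
                              uint32_t code_start, uint32_t code_end,
                              size_t *out_index, size_t max_out)
{
    size_t found = 0;

    for (size_t i = 0; i < words && found < max_out; i++) {
        if (looks_like_return_address(stack[i], code_start, code_end)) {
            out_index[found++] = i;
        }
    }
    return found;
}
```

This is still only a heuristic: any odd data word that happens to fall inside the code range is reported as well, so the result is an inferred call stack rather than an exact one.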
For completeness, here is my code that addresses fault dumps and inferred call stacks:
https://github.com/tobermory/faultHandling-cortex-m.git
Thanks. I had problems logging in, so this is the soonest I could respond.
If you compare your code above with mine (see the Github link), you'll see that you do the fault dump data formatting in the fault handler, i.e. after the fault has already occurred. I do it ahead of time, and use minimal code to fill in the register value 'holes'. I bypass sprintf entirely, preferring to hex format values by hand. I was nervous of calling into arbitrary C library routines once a fault had happened. I think the chance of a lockup (fault in fault handler) increases. On my board, a lockup defaults to a reset, and the fault capture would be lost entirely.
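For readers who want to see the general idea of "format ahead of time, fill in the holes", here is a minimal sketch. It is not the code from the Github repository; the template text, the hole offsets and the names put_hex32 and fill_fault_line are all made up for illustration.

```c
#include <stdint.h>

/* Writes exactly 8 lowercase hex digits at dst. No terminator is added,
   so the surrounding template text is left untouched. */
static void put_hex32(char *dst, uint32_t value)
{
    static const char digits[] = "0123456789abcdef";

    for (int i = 7; i >= 0; i--) {
        dst[i] = digits[value & 0xFu];
        value >>= 4;
    }
}

/* Template built at compile time. The zeros are the 'holes'. */
static char fault_line[] = "pc=00000000 lr=00000000\r\n";

enum {
    PC_HOLE = 3,    /* offset of the first digit after "pc=" */
    LR_HOLE = 15    /* offset of the first digit after "lr=" */
};

/* The only work left for the fault handler: two short loops, no library calls. */
const char *fill_fault_line(uint32_t pc, uint32_t lr)
{
    put_hex32(&fault_line[PC_HOLE], pc);
    put_hex32(&fault_line[LR_HOLE], lr);
    return fault_line;
}
```

The point of the design is that nothing in the fault path depends on the C library, the heap, or a large stack frame.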
Hello tobermory,
>I bypass sprintf entirely, preferring to hex format values by hand.
sprintf is a function I should avoid in embedded code. It is not MISRA compliant, and it is huge. I had a coworker who used an embedded printf. It was corrupting memory because the implementation used a static buffer of a fixed length, and he needed more than that. He spent a lot of time searching for what went wrong.
In a fault handler, sprintf is even less suitable.
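Where a printf-style call cannot be avoided, the overflow problem described above can at least be detected. A minimal sketch, assuming a standard C99 vsnprintf is available; the helper name append_checked is hypothetical:

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Formats into dst, which has space_left bytes available.
   Returns the number of characters written (excluding the terminator),
   or 0 if the formatted text did not fit completely. */
size_t append_checked(char *dst, size_t space_left, const char *fmt, ...)
{
    va_list args;
    int needed;

    va_start(args, fmt);
    /* vsnprintf never writes more than space_left bytes and returns the
       length the full output would have had. */
    needed = vsnprintf(dst, space_left, fmt, args);
    va_end(args);

    if (needed < 0 || (size_t) needed >= space_left) {
        if (space_left > 0) {
            dst[0] = '\0';   /* discard the truncated output */
        }
        return 0;
    }
    return (size_t) needed;
}
```

This does not make the call any smaller or safer to use inside a fault handler; it only turns a silent truncation or overrun into a return value that can be checked.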
Have a look at the first hit of a Google search for 'Small printf source code'. I have not tried it myself, but it could be interesting.
>I think the chance of a lockup (fault in fault handler) increases.
Agreed.