Cortex-M7 frame pointer in prologue

Hello,

I am trying to understand how the frame pointer works because I want to unwind the stack in a HardFault handler.

I am looking at a disassembly of code that runs perfectly on an Atmel ATSAMV71Q21 (Cortex-M7). It was compiled with GCC in the Atmel Studio 7 IDE. To get the frame pointer, I compiled with -fno-omit-frame-pointer -mtpcs-frame -mtpcs-leaf-frame. It looks like GCC uses register r7 as the frame pointer in Thumb-2 mode.

The function prologue has a push, a sub and an add. I would like to confirm whether the Cortex-M7's superscalar 6-stage pipeline stalls at the instruction at code address 0x00401dce because of its dependency on the SP updated by the push at 0x00401dcc.

Why doesn't the frame pointer point to something more predictable like the previously pushed r7 frame pointer or the previous SP value before entering the function?

int I2cHW::endTransmission(){
  401dcc:	b5b0      	push	{r4, r5, r7, lr}       /* Save registers. The stack moves down by 4 registers * 4 bytes = 16 bytes. The frame pointer is r7; lr is the link register. */
  401dce:	b086      	sub	sp, #24			/* Allocate 24 bytes on the stack: lower the stack pointer by 24 bytes. */
  401dd0:	af04      	add	r7, sp, #16      /* frame pointer = stack pointer + 16. Why??? */
  
  
  
  .... BODY REMOVED ....
  
      if (isAckValid){
  401edc:	7a63      	ldrb	r3, [r4, #9]
  401ede:	b11b      	cbz	r3, 401ee8 <_ZN5I2cHW15endTransmissionEv+0x11c>
        return 0;
  401ee0:	2000      	movs	r0, #0
    }else{
        return 1;
    }
}
  401ee2:	3708      	adds	r7, #8   /* move frame pointer up by 8 */
  401ee4:	46bd      	mov	sp, r7		/* stack pointer = frame pointer */
  401ee6:	bdb0      	pop	{r4, r5, r7, pc}  /* Restore registers. The stack moves up by 4 registers * 4 bytes = 16 bytes. */
        return 1;
  401ee8:	2001      	movs	r0, #1
  401eea:	e7fa      	b.n	401ee2 <_ZN5I2cHW15endTransmissionEv+0x116>
  401eec:	2044f31c 	.word	0x2044f31c
  401ef0:	00451e94 	.word	0x00451e94
  401ef4:	00451de4 	.word	0x00451de4
  401ef8:	00451fac 	.word	0x00451fac
  401efc:	00451e5c 	.word	0x00451e5c
  401f00:	0044e5ed 	.word	0x0044e5ed
  401f04:	00447a71 	.word	0x00447a71
  401f08:	00440f71 	.word	0x00440f71
  401f0c:	00451f24 	.word	0x00451f24
  401f10:	00451fc4 	.word	0x00451fc4
  401f14:	00405421 	.word	0x00405421
  401f18:	00452004 	.word	0x00452004
  401f1c:	00451fec 	.word	0x00451fec
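The offsets in the listing above can be checked with plain arithmetic. The following is a minimal sketch in C, not output from any tool: it replays the three prologue instructions and the three epilogue instructions on a hypothetical entry stack pointer. The names frame_trace_t and trace_frame are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record of the values seen while replaying the listing. */
typedef struct {
    uint32_t sp_after_prologue; /* SP once push and sub have executed */
    uint32_t r7;                /* frame pointer set by add r7, sp, #16 */
    uint32_t sp_after_pop;      /* SP after the final pop */
} frame_trace_t;

/* Replay the prologue and epilogue of the listing on a given entry SP. */
frame_trace_t trace_frame(uint32_t sp_entry)
{
    frame_trace_t t;
    uint32_t sp = sp_entry;
    uint32_t r7;

    sp -= 4u * 4u;      /* push {r4, r5, r7, lr}: 4 registers, 4 bytes each */
    sp -= 24u;          /* sub sp, #24 */
    r7 = sp + 16u;      /* add r7, sp, #16 */
    t.sp_after_prologue = sp;
    t.r7 = r7;

    r7 += 8u;           /* adds r7, #8 */
    sp = r7;            /* mov sp, r7 */
    sp += 4u * 4u;      /* pop {r4, r5, r7, pc} */
    t.sp_after_pop = sp;

    /* The epilogue must undo the prologue exactly. */
    assert(t.sp_after_pop == sp_entry);
    return t;
}
```

With an entry SP of 0x20001000, r7 comes out as 0x20000FE8, i.e. entry SP minus 24. In this particular frame the pushed registers start at r7 + 8, which puts the saved r7 at r7 + 16 and the saved LR at r7 + 20. Those offsets follow from this listing only; another function may use different ones.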

0 Silicium over 3 years ago in reply to Silicium

Hello @tobermory,

I ended up writing a call stack unwind function mostly in the way you described it here. My code is below. It doesn't look at R7 and doesn't care about stack frames.

P.S. This thread seems to have undergone some editing and deletions. For a while, only a portion of the replies were visible.

__attribute__((naked)) void callStackUnwindIntoBuffer ( char *callstackunwindbuffer , int callstackunwindbufferLength){


   asm volatile (


   "MOV R2, SP\n\t"
   "b callStackUnwindIntoBuffer_c\n\t"

   );

}

void callStackUnwindIntoBuffer_c( char *callstackunwindbuffer , int callstackunwindbufferLength, uint32_t *pStack ){

   volatile uint32_t *locationOfLR;
   char *pDest = callstackunwindbuffer;
   int spaceLeft = callstackunwindbufferLength;
   uint32_t length;

   const uint32_t qtyCallStackLevels = 14;

   const uint32_t ignoredLevels = 0;
   /*locationOfLR = (uint32_t *) __get_MSP();*/
   locationOfLR = pStack;

   char localBuffer[40];
   extern uint32_t _sstack, _estack;   /* linker-script symbols; typed as uint32_t so they compare cleanly with locationOfLR */

   int i=0;

   while ( i<qtyCallStackLevels ){

       /* Linear search for the next word that looks like a return address.
          The stack bound is checked first so we never read past _estack. */

       while(
               (locationOfLR < &_estack)
               &&
               (
                   (( (*locationOfLR) & 0xFFFF0000 )< 0x00400000)
                   ||
                   (( (*locationOfLR) & 0xFFFF0000 )> 0x004C0000)
                   /*|| ( ((*locationOfLR) & 1) == 0)*/
               )
           ) {
           locationOfLR++;
       }

       if( (i>= ignoredLevels) && (locationOfLR!=&_estack) ){
           snprintf(localBuffer, sizeof localBuffer, "%08lx: 0x%08lx\r\n",   (unsigned long) locationOfLR, (unsigned long) *locationOfLR    );
           length = strlen(localBuffer);
           if (length < spaceLeft){
               snprintf(pDest, spaceLeft, "%s", localBuffer);
               spaceLeft = spaceLeft - length;
               pDest += length;
           }

       }
       i++;

       if ((locationOfLR>=&_estack)){   /* top of stack reached: stop before reading past it */
           i = qtyCallStackLevels;
       }else {
           locationOfLR++;
       }
   }


   snprintf(localBuffer, sizeof localBuffer, "END\n");
   length = strlen(localBuffer);
   if (length < spaceLeft){
       snprintf(pDest, spaceLeft, "%s", localBuffer);
       spaceLeft = spaceLeft - length;
       pDest += length;
   }

}
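The commented-out bit-0 test in the search loop can be folded into a small filter function. This is a sketch only: the name looks_like_return_address and the CODE_START/CODE_END macros are hypothetical, and the address window is an assumption loosely based on the constants in the code above, so adjust it to the real flash layout. It relies on the fact that return addresses saved by Thumb code have bit 0 set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed code window (half-open range); replace with the real flash bounds. */
#define CODE_START 0x00400000u
#define CODE_END   0x004C0000u

/* Return true if a stack word could be a saved Thumb return address. */
bool looks_like_return_address(uint32_t word)
{
    uint32_t addr;

    /* Thumb return addresses have bit 0 set. The parentheses matter:
       `word & 1 == 0` parses as `word & (1 == 0)`, which is always 0. */
    if ((word & 1u) == 0u) {
        return false;
    }

    addr = word & ~1u;  /* strip the Thumb bit to get the instruction address */
    return (addr >= CODE_START) && (addr < CODE_END);
}
```

A word that passes this filter is still only a candidate: a stale value or a function pointer stored in a local variable looks the same, so the resulting call stack is inferred rather than exact.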
+1 tobermory over 3 years ago in reply to Silicium

For completeness, here is my code that addresses fault dumps and inferred call stacks:

https://github.com/tobermory/faultHandling-cortex-m.git
0 Silicium over 3 years ago in reply to tobermory

Thanks. I had problems logging in, so this is the soonest I could respond.
0 tobermory over 3 years ago in reply to Silicium

If you compare your code above with mine (see the GitHub link), you'll see that you do the fault dump data formatting in the fault handler, i.e. after the fault has already occurred. I do it ahead of time, and use minimal code to fill in the register value 'holes'. I bypass sprintf entirely, preferring to hex-format values by hand. I was nervous about calling into arbitrary C library routines once a fault had happened, because I think the chance of a lockup (fault in fault handler) increases. On my board, a lockup defaults to a reset, and the fault capture would be lost entirely.
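To illustrate the idea of a pre-formatted dump with 'holes', here is a minimal sketch in C. It is not taken from the GitHub repository mentioned above; the names put_hex32, fault_line and fill_fault_line, and the layout of the template line, are assumptions made for this example. The only work left for the fault handler is shifting, masking and table lookups, with no C library calls.

```c
#include <stdint.h>

/* Write value as exactly 8 uppercase hex digits at dest.
   No terminator is written, so the surrounding template stays intact. */
void put_hex32(char *dest, uint32_t value)
{
    static const char digits[] = "0123456789ABCDEF";
    int i;

    for (i = 7; i >= 0; i--) {
        dest[i] = digits[value & 0xFu];  /* lowest nibble goes rightmost */
        value >>= 4;
    }
}

/* Template prepared ahead of time; the X runs are the 'holes'. */
static char fault_line[] = "PC=XXXXXXXX LR=XXXXXXXX\r\n";

/* Fill the holes in place and return the finished line. */
const char *fill_fault_line(uint32_t pc, uint32_t lr)
{
    put_hex32(&fault_line[3], pc);   /* hole after "PC=" */
    put_hex32(&fault_line[15], lr);  /* hole after "LR=" */
    return fault_line;
}
```

Because the template is a fixed-size static array and each hole is exactly 8 characters wide, the output length is known at compile time and nothing can write past the buffer.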
0 Silicium over 3 years ago in reply to tobermory

Hello tobermory,

>I bypass sprintf entirely, preferring to hex format values by hand.

sprintf is a function I should avoid in embedded code. It is not MISRA compliant. Also, sprintf is huge. I had a coworker who used an embedded printf. It was corrupting memory because the implementation used a fixed-size static buffer and he needed more than that length. He spent a lot of time searching for what went wrong.

In a fault handler, sprintf is even less ideal.

Look at the first hit of a Google search for 'Small printf source code'. I have not tried it, but it could be interesting.

>I think the chance of a lockup (fault in fault handler) increases.

Agreed.