NOTE2: This code is inspired by, and optimized from, the work of other authors who know
ARM assembly and the Cortex architecture better than I do.
NOTE3: The re-entrant code assumes that at interrupt exit the processor returns to task
space (whether on PSP or MSP). Hence, to avoid corrupting the stack, the preemption function
should only be called from the lowest-priority interrupt in the program.
Function description:
RIPCrun( FUNCTION ):
- First pushes a dummy stack frame (only 32 bytes) onto the stack and returns from the interrupt.
- The return address programmed into the dummy frame points into the same function, so that the rest of the code executes in thread mode (instead of at the interrupt priority).
- Once back in thread mode, the code calls the function FUNCTION. This is a normal
function call (i.e. registers are saved again by the usual calling mechanism).
- On return it raises a software-triggered interrupt (SVC) to restore the stack.
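As a rough illustration of the first two steps, the dummy frame can be modelled as an array of eight 32-bit words. This is a host-side sketch only, not the author's code: the names FRAME_*, XPSR_THUMB_BIT and build_dummy_frame are hypothetical, and the real RIPCrun does the same stores directly in assembly (STR R0, [SP], STR R0, [SP, #24], STR R0, [SP, #28]).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Word indices inside the 8-word basic exception frame (no FPU state). */
enum {
    FRAME_R0 = 0, FRAME_R1, FRAME_R2, FRAME_R3,
    FRAME_R12, FRAME_LR, FRAME_PC, FRAME_XPSR,
    FRAME_WORDS
};

#define XPSR_THUMB_BIT 0x01000000u  /* T bit, the "fresh PSR" used by RIPCrun */

/* Hypothetical helper: fill a dummy frame so that an exception return
 * resumes at resume_addr with callback_addr delivered in R0. */
static void build_dummy_frame(uint32_t frame[FRAME_WORDS],
                              uint32_t callback_addr,
                              uint32_t resume_addr)
{
    assert(frame != NULL);
    for (size_t i = 0; i < FRAME_WORDS; ++i) {
        frame[i] = 0;  /* zeroed here for clarity; the real code leaves these slots unset */
    }
    frame[FRAME_R0]   = callback_addr;   /* byte offset 0:  popped into R0   */
    frame[FRAME_PC]   = resume_addr;     /* byte offset 24: popped into PC   */
    frame[FRAME_XPSR] = XPSR_THUMB_BIT;  /* byte offset 28: popped into xPSR */
}
```

On exception return the processor unstacks these eight words, so execution continues at resume_addr in thread mode with the callback address already in R0.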
SVC_Handler:
- Determines which SVC code was called.
- For any other code, the traditional handler (SVC_Orig_Handler) is executed.
- Otherwise it calls RIPCrestoreSP, which cleans up the original interrupt stack.
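The "which SVC code was called" step relies on the fact that a Thumb SVC instruction is two bytes long, with the 8-bit service number in the low byte, and that the stacked PC points just past it. The sketch below mimics that lookup on a host machine; svc_number_from_frame and FRAME_PC_INDEX are hypothetical names, and the frame is modelled with uintptr_t so that a host pointer fits in a slot (on the target each slot is 32 bits).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_PC_INDEX 6  /* stacked PC sits at byte offset 24 of the frame */

/* Hypothetical helper: read the SVC immediate of the instruction that
 * raised the exception. The stacked PC points just past the 2-byte SVC,
 * whose low byte (at PC - 2) holds the service number. */
static uint8_t svc_number_from_frame(const uintptr_t *frame)
{
    assert(frame != NULL);
    const uint8_t *stacked_pc = (const uint8_t *)frame[FRAME_PC_INDEX];
    return stacked_pc[-2];
}
```

This corresponds to the LDR R1, [R0,#24] / LDRB.W R0, [R1, #-2] pair in SVC_Handler, after R0 has been loaded from MSP or PSP.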
NOTE: Why do we restore the stack in the SVC handler instead of in RIPCrun? A Cortex CPU supports
two threading models, using one or two different stacks (PSP/MSP) depending on the mode.
Hence the original frame is saved on a stack that depends on the threading model. The
SVC call ensures that the processor recovers the stack appropriately.
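Which stack and which frame size are in play is encoded in the EXC_RETURN value that the processor places in LR on exception entry. A minimal decoding sketch follows; the macro and function names are hypothetical, and the sizes ignore the optional alignment padding word signalled by bit 9 of the stacked xPSR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXC_RETURN_SPSEL (1u << 2)  /* 1 = frame on PSP, 0 = frame on MSP  */
#define EXC_RETURN_NO_FP (1u << 4)  /* 1 = basic frame, 0 = FP state saved */

/* True when the interrupted code was running on the process stack. */
static bool frame_on_psp(uint32_t exc_return)
{
    assert((exc_return >> 28) == 0xFu);  /* must be an EXC_RETURN pattern */
    return (exc_return & EXC_RETURN_SPSEL) != 0u;
}

/* Size of the hardware-stacked frame in bytes, without alignment padding:
 * 8 words for the basic frame, 26 words when FP state is included. */
static uint32_t frame_bytes(uint32_t exc_return)
{
    assert((exc_return >> 28) == 0xFu);
    return (exc_return & EXC_RETURN_NO_FP) ? 32u : 104u;
}
```

These are the same two bits the assembly tests with TST LR, #0x04 (stack selection) and TST LR, #0x10 (FP frame present).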
MAJOR differences from Sippey's code:
1st, defines are used to decide which priority levels and callback procedures to use;
2nd, the whole implementation is done using GCC inline assembly;
3rd, naked "C" functions are used to limit the overhead due to function calls;
4th, the RIPCrun function encodes the return address locally (ADDW R0, PC,16 ; SKIP 8 Instruction from here) to simplify the code.
/**
 * Reentrant Interrupt Procedure Call (RIPC)
 *
 * ARM-GCC code to implement REENTRANT interrupt procedures.
 * Source of inspiration:
 *  - "The Definitive Guide to ARM Cortex-M3 and Cortex-M4 Processors" 3rd ed.
 *  - Sippey code for KEIL: Sippey (sippey@gmail.com)
 *
 *
 * ESSENTIAL INFORMATION
 *
 * CORTEX M4 Register Structure
 *  - CPU 16 Register (R0-R15) + PSR
 *  - FPU 32 Register (S0-S31) + FPSCR
 *
 * Calling CONVENTIONS:
 *  - R0-R3, R12, LR, and PSR are called “caller saved registers.”
 *  - R4-R11 are called “callee-saved registers.”
 *
 *  - S0-S15 + FPSCR are “caller saved registers.”
 *  - S16-S31 are “callee-saved registers.”
 *
 * Typical Calling Layout
 *  R0/R1 is Return Result Value if any
 *  R0-R3 are parameter value (with the above exception)
 *  R12 is a scratch register
 *  R13 used to store SP
 *  R14 link register (return address)
 *  R15 is Program Counter
 *
 *
 * Stack Structure (growing from TOP to LOW memory)
 * BEWARE for efficiency Stack is manipulated aligned to 8 bytes always
 * in case of ODD number of registers it gets padded with white space
 *
 * Cases:  NOFPU                      FPU
 *         PREVIOUS TOP               PREVIOUS TOP    LAST STACKED ITEM
 *    32   (pad align 8)              (pad align 8)   if PADDING present xPSR bit9 == 1
 *    28   xPSR                  96   FPSCR
 *    24   ReturnAddr            92   S15
 *    20   LR                    88   S14
 *    16   R12                   84   S13
 *    12   R3                    80   S12
 *     8   R2                    76   S11
 *     4   R1                    72   S10
 *     0   R0*                   68   S9              NO FP Stack pointer here
 *    ==============             64   S8
 *    8 REGs                     60   S7
 *                               56   S6
 *    (total 8x4=32bytes)        52   S5
 *                               48   S4
 *                               44   S3
 *                               40   S2
 *                               36   S1
 *                               32   S0
 *                               28   xPSR
 *                               24   ReturnAddr
 *                               20   LR
 *                               16   R12
 *                               12   R3
 *                                8   R2
 *                                4   R1
 *                                0   R0*             FP Stack pointer here
 *                               ====================
 *                               8+17 = 25 REGS PADDED to 26 (Total 26*4=104bytes)
 *
 * The return address is the stacked PC
 * While Stacked LR was previous return address
 *
 * BX LR is return from subroutine
 * if LR starts with 0xFxxxxxxx then it is interpreted as Return from Interrupt (Exception Return)
 * Possible Exception return values are:
 *
 * if FPU was used before interrupt call
 *   0xFFFFFFE1 Return to another exception using MSP (Master)
 *   0xFFFFFFE9 Return to thread using MSP (Master) stack pointer
 *   0xFFFFFFED Return to thread using PSP (process) stack pointer
 *
 * if FPU was not used before CALL
 *   0xFFFFFFF1 Return to another exception using MSP (Master)
 *   0xFFFFFFF9 Return to thread using MSP (Master) stack pointer
 *   0xFFFFFFFD Return to thread using PSP (process) stack pointer
 *
 */
#include <misc.h>
#include <stm32f4xx.h>

// Lazy using strings to pass parameter to Assembly code
#define SVC_CALL_NUMBER "0"    // SVC_CALL_NUMBER being used
#define PRI_LEVEL_LOCK  "240"  // Level 15 for STM32F4

static void RIPCrun( void (*fcn)(void) ) __attribute__ (( naked, used ));
static void RIPCrestoreSP( void ) __attribute__ (( naked,used ));

/**
 * This is the NEW default handler for standard SVC; if used, override it
 * as required, as usual in CM4
 */
__attribute__(( weak,used )) void SVC_Orig_Handler()
{
    while(1); // No other default service! Catch or return?
}

/**
 * \brief RIPCrun makes the interrupt reentrant. It pushes a dummy
 * stack, loads a fake return address depending on the FPU and call type
 * and returns. The function to call is given as param.
 * Usage example
 * void SysTickHandler()
 * {
 *     // NON REENT CODE BEFORE
 *     RIPCrun(reentrant_Handler); // Call to reentrant code
 * }
 *
 * To avoid undesired preemption, the call is made in two stages:
 * first we call/return to RIPstub, which in turn calls the desired
 * Handler
 *
 * Note that the interrupt being made reentrant should have the lowest
 * priority.
 */
static void RIPCrun( void (*fcn)(void) )
{
    // R0 at entry contains the jumping address
    __asm volatile(
#ifdef __FPU_USED
        " TST LR, #0x10          \n" /* Test bit 4 to check usage of FPU register */
        " IT EQ                  \n"
        " VMOVEQ.F32 S0, S0      \n" /* Mark FPU used for Lazy stacking operation */
#endif
        " MRS R1, xPSR           \n" // Should be xPSR ??
        " PUSH {R1, LR}          \n" /* Push PSR and LR on the stack */
        " SUB SP, #0x20          \n" /* Reserve additional 8 words for a complete dummy stack return */
        " STR R0, [SP]           \n" // Pass the R0 to Callee in return
        " ADDW R0, PC,16         \n" // RIPCservice (SKIP 8 Instruction from here)
        " STR R0, [SP, #24]      \n" // Handler Launcher in thread (Temp return addr)
        " MOV R0, #0x01000000    \n" // Generate a fresh new PSR
        " STR R0, [SP, #28]      \n" // and store it (PSR) in proper offset
        " MOV R0, #0xFFFFFFF9    \n" // Create a return value for ISR return to MSP no FP (8 Word frame)
        " MOV LR, R0             \n" // and place it to LR to emulate standard ISR return
        " BX LR                  \n" // The return here will use our dummy stack
        // RIPCService
        /**
         * Now we have exited the interrupt and enter immediately here (SP+24 to this address).
         * At return the R0 register will be populated from the dummy stack with the parameter passed
         * to RIPCrun (ex R0) and we will jump there immediately.
         * Note this procedure call will be handled on the MSP stack whatever the original
         * THREAD stack was (PSP or MSP).
         */
        " BLX R0                 \n" // RIPService Call function desired
        " MOVS R0, #" PRI_LEVEL_LOCK " \n" // Rearrange PRIORITY level to
        " MSR BASEPRI, R0        \n" // Block further trigger on our base interrupt
        " ISB                    \n" // ISB required to wait for BASEPRI effect (avoid further preemption)
        " SVC #" SVC_CALL_NUMBER " \n" // Replace here with desired syscall number
//      " BL RIPCerror           \n" // SVC will reset stack, we should not return here
    );
    while(1); // We should never get here, otherwise stack was messed up!
}

/**
 * \brief Control logic is the following
 *   if (GET_SVC_NUMBER == SVC_CALL_NUMBER)
 *       RIPCrestoreSP();
 *   else
 *       SVC_Orig_Handler();
 * This handler and the RIPCrestoreSP function restore the stack and hence should be protected against
 * further reentrant interrupts of the same kind, otherwise the stack can be messed up.
 * The SVC handler always executes with the MSP stack, but the frame of the original SVC call can be stored on
 * MSP or PSP. Hence the initial test serves to properly extract the SVC number.
 */
__attribute__(( naked )) void SVC_Handler()
{
    __asm volatile(
        " TST LR, #0x04          \n" /* Test EXC return bit 2 (MSP or PSP?) */
        " ITE EQ                 \n" // if 0
        " MRSEQ R0, MSP          \n" // Get SP from MSP
        " MRSNE R0, PSP          \n" // else use PSP
        " LDR R1, [R0,#24]       \n" // This is offset of stacked PC
        " LDRB.W R0, [R1, #-2]   \n" // Check SVC calling service
        " CMP R0, #" SVC_CALL_NUMBER "\n" // Replace here with desired syscall number
        " BEQ RIPCrestoreSP      \n" // use our modified SVC handler
        " B SVC_Orig_Handler     \n" // else jump to the original handler
    );
    while(1); // We should never get here, otherwise stack was messed up!
}

/**
 * \brief this function is called after the SVC handler properly identified we are
 * returning from a reentrant interrupt.
 *
 * OPERATIONS:
 *  - We restore BASEPRI, which was set to avoid nesting of SVC_Handler (which produces a fault).
 *  - We remove the stack frame pushed by the SVC_Handler call.
 *  - We recover PSR and LR as originally stored in RIPCrun
 *  - We return from this SVC using the stack pushed for RIPCrun.
 *
 * DOUBT: Why trigger lazy stacking here? Does it copy values into a dummy stack which
 * is trashed a couple of instructions later?
 */
static void RIPCrestoreSP( void )
{
    __asm volatile(
        " MOVS R0, #0            \n" /* Use the lowest priority level */
        " MSR BASEPRI, R0        \n" // to re-enable the interrupt
        " ISB                    \n" // Ensure synchronization
#ifdef __FPU_USED
        " TST LR, #0x10          \n" /* Test bit 4 to check usage of FPU register */
        " IT EQ                  \n"
        " VMOVEQ.F32 S0, S0      \n" /* Mark FPU use for Lazy stacking operation */
#endif
        " TST LR, #0x10          \n" /* Test bit 4 to check usage of FPU register */
        " ITE EQ                 \n"
        " ADDEQ SP, SP, #104     \n" // Restore stack properly
        " ADDNE SP, SP, #32      \n"
        " POP {R0, R1}           \n" /* Pop PSR and LR from the stack */
        " MSR APSR_nzcvq,R0      \n" // Should be xPSR ??
        " BX R1                  \n" // Finally jump to R1
    );
    while(1); // We should never get here, otherwise stack was messed up!
}

#define TEST_REENT
#ifdef TEST_REENT
#define NESTLEVEL 20

static int pass = 1;
float NPI[20];
unsigned int stackIN[NESTLEVEL];
unsigned int stackOUT[NESTLEVEL];
unsigned int nesting = 0;

/**
 * \brief executes some FP operation. Marks stack at entrance and exit and
 * waits in the middle for a number of nested recursions.
 *
 * Note that the Stack consumption is about 72 bytes for nonFP reent
 * and 144 bytes for FP reent. This is due to the double procedure
 * call that is set at each interrupt (i.e. the original stack call
 * is preserved until the end + one procedure call goes through the BLX)
 *
 * We have 8 more local bytes on the stack
 *
 * Which makes 32+8+32 (Two complete stacks + 8 bytes for temporary PSR&LR)
 *
 * Or 104 + 32 + 8 = 144 in case of FP call stack
 *
 * The bytes overhead w.r.t. the standard mechanism is hence 40 bytes.
 *
 * Beware to have a large enough stack for reentrancy.
 */
void ReentTickTest()
{
    register unsigned int *stackref;
    int a=0,lev;

    lev=nesting++;
    __asm__ ("mov %0, sp" : "=r" (stackref) : );
    stackIN[lev]=(uint32_t)stackref;
    NPI[lev] = 3.1415926535f*lev;
    while((a<6)&&(pass==1)) { // Wait for Reentrancy
        a=nesting;
    }
    pass=0;
    __asm__ ("mov %0, sp" : "=r" (stackref) : );
    stackOUT[lev]=(uint32_t)stackref;
    if (lev==0)
        pass=2;
}

void SysTick_Handler()
{
    RIPCrun(ReentTickTest);
}

int main(void)
{
    float jj,kk;
    jj = 3.14;
    kk = jj*2;
    jj=kk;

    // The chosen IRQn should be the lowest in the system so that we are
    // sure that when this interrupt is exited we will return to thread
    // mode with a well-known stack recovery mechanism.
    //
    // The alternative is to disable the interrupt in the code, but this
    // violates the rule of MAX 12 cycles for interrupt latency, which is
    // one of the best features of Cortex
    NVIC_SetPriority(SysTick_IRQn,15);
    SysTick_Config(16000);

    for (;;) {
        while(1) {
            if (pass==2)
                break;
        }
        pass=1;
        nesting = 0;
    }
}
#endif
Post Script:
If you enable GCC optimization, remember to exclude the above functions from the
optimizer.
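One possible way to do this, assuming GCC, is the per-function optimize attribute (GCC also offers #pragma GCC push_options / #pragma GCC optimize / #pragma GCC pop_options for whole regions). The sketch below is an assumption about what "exclude from the optimizer" could look like, not something taken from the original code; RIPC_NO_OPT and frame_words_to_bytes are hypothetical names, and the function body is only a placeholder.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro: disable optimization for one function on GCC,
 * expand to nothing on other compilers. */
#if defined(__GNUC__) && !defined(__clang__)
#define RIPC_NO_OPT __attribute__((optimize("O0")))
#else
#define RIPC_NO_OPT
#endif

/* Placeholder function standing in for a handler that must be compiled
 * exactly as written, regardless of the global -O level. */
static RIPC_NO_OPT uint32_t frame_words_to_bytes(uint32_t words)
{
    assert(words <= UINT32_MAX / 4u);  /* guard against overflow */
    return words * 4u;
}
```

In the listing above the macro would go next to the existing naked/used attributes on RIPCrun, RIPCrestoreSP and SVC_Handler. Compiling that one source file with -O0 is a simpler alternative.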