Cortex M4 exception return sequence

Hi,

I think I am just getting confused by this, even though (or because) I have read the book and manuals again and again.

At exception entry, the processor saves R0-R3, R12, LR, PC and PSR on the stack. Saving PC means that the address of the instruction to be executed next after return from the exception handler is saved on the stack. However, the documentation also says that LR is updated with EXC_RETURN and that when the EXC_RETURN value is loaded to the PC, the exception return sequence begins.

So the confusion is this: it is the stacked PC value that should be loaded into the PC in order to return to the point where the exception was taken. The EXC_RETURN value is not a valid address; its lower 5 bits only indicate which stack was used, the return mode (thread/handler), etc.

Or is it that the loading of PC with EXC_RETURN is then followed up by loading of the PC with the stacked PC value?

Can someone please help clear up this confusion?

Thanks,

Gopal

  • As you say, the EXC_RETURN values are special values that are recognized by the hardware rather than proper pc values. Loading an EXC_RETURN value into the program counter initiates the hardware sequence that does the reverse of the sequence which happened when the interrupt came in. That reverse sequence will then load the actual pc to resume at. You don't explicitly load the various registers; that is all done automatically by the return sequence.

    Doing the entry and return in hardware allows the processor to skip the return sequence if there is a pending interrupt and immediately start handling the new interrupt instead, without having to load the registers on return and then store them again before entering the new interrupt handler.

  • Yes being able to write the interrupt routines as straightforward C functions is as you say real cool. And it explains the choice of registers which are stored, they are the ones the calling standard says a program is allowed to change without restoring on return. A good instance of the software and hardware working together.

  • Thanks jensbauer

    It was steps 10 and 11 that I was not clear about.

    Thanks daith for your inputs as well !

  • As daith says, it's possible that another interrupt will be handled by tail-chaining.

    This may occur between step 8 and step 9.

    jyiu once explained the details about this, that the registers R0-R3 and R12 will not contain values identical to what is on the stack on interrupt entry.

    -In fact, you can never trust what's in R0-R3 and R12, so if you need those values (for instance if you're using SVC, or if you're making some debug-facility), then fetch them from the stack.

  • -In fact, you can never trust what's in R0-R3 and R12, so if you need those values (for instance if you're using SVC, or if you're making some debug-facility), then fetch them from the stack.

    That makes sense given the wonderful features such as Tail chaining or pop pre-emption.

    As the AAPCS calls for, R0-R3 can be used as input parameters/arguments to the function being called, but for a handler it is safer to fetch the values from the stack instead of reading the registers directly. I think the handler cannot know under which circumstances it is executing: whether it was entered by tail-chaining or directly from thread mode.

    If the handler is entered from thread mode while a normal user program is executing, then R0-R3 will hold the correct values, but in a case like tail-chaining they may not.
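    To picture what "fetch them from the stack" means, here is a minimal C sketch of the basic (no floating-point) stacked frame, with R0 at the lowest address. The names `stacked_frame` and `stacked_arg` are made up for illustration, and the sketch assumes the handler has already been given a pointer to the frame (taken from MSP or PSP as appropriate).

```c
#include <stdint.h>

/* Basic exception stack frame as pushed by the hardware (no FP state).
 * Field order follows ascending addresses, starting where SP points
 * after stacking. The type name is hypothetical. */
typedef struct {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;    /* LR of the interrupted code, not EXC_RETURN */
    uint32_t pc;    /* address to resume at */
    uint32_t xpsr;
} stacked_frame;

/* Return stacked R0..R3 by index (0..3); any other index yields 0.
 * Hypothetical helper: reads the saved values instead of trusting the
 * live registers, which may have changed before the handler runs. */
static uint32_t stacked_arg(const stacked_frame *frame, unsigned index)
{
    switch (index) {
    case 0:  return frame->r0;
    case 1:  return frame->r1;
    case 2:  return frame->r2;
    case 3:  return frame->r3;
    default: return 0u;
    }
}
```

    On a real target the frame pointer would come from a small assembly wrapper; that part is left out here.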

  • To me, it looks like you've understood almost all of it correctly.

    Loading PC with the value of LR is sufficient. LR already holds EXC_RETURN, and you do not have to worry about which stack you need to use; the EXC_RETURN in LR is pre-encoded with the correct value.

    Normally you only have to change the EXC_RETURN value when you're writing a context-switcher.

    This is what more or less happens:

    1. An interrupt is signalled; a pending-flag is set.
    2. The interrupt is started; the registers xPSR, PC, LR, R12, R3-R0 are all pushed onto the stack that is currently in use (MSP or PSP).
    3. The processor switches to handler mode, which always uses the main stack (MSP).
    4. The LR is loaded with the EXC_RETURN value (which is one of these: 0xFFFFFFF1, 0xFFFFFFF9, 0xFFFFFFFD, 0xFFFFFFE1, 0xFFFFFFE9 or 0xFFFFFFED).
    5. The PC is loaded with the address from the interrupt-vector.
    6. Your Interrupt Service Routine is executed.
    7. You make sure the LR register is saved/restored if it's changed.
    8. You finish your Interrupt Service Routine by executing a BX LR instruction.
    9. The EXC_RETURN value from the LR register is now moved into PC.
    10. The core now sees that this is a special return-address, so it restores the registers from the stack that EXC_RETURN selects.
    11. When the registers are restored, the execution continues where it was interrupted.
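    To make the six EXC_RETURN values in step 4 less magic, here is a small C sketch that decodes the low bits, assuming the Cortex-M4 encoding (bit 2 selects the stack, bit 3 the return mode, bit 4 the frame type). The struct and function names are only illustrative, not taken from any ARM header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of an EXC_RETURN value (hypothetical type). */
typedef struct {
    bool valid;            /* bits [31:5] are all ones                  */
    bool return_to_thread; /* bit 3: 1 = Thread mode, 0 = Handler mode  */
    bool use_psp;          /* bit 2: 1 = unstack from PSP, 0 = from MSP */
    bool basic_frame;      /* bit 4: 1 = 8-word frame, 0 = FP-extended  */
} exc_return_info;

/* Decode the fields the hardware looks at when EXC_RETURN reaches PC. */
static exc_return_info decode_exc_return(uint32_t exc_return)
{
    exc_return_info info;
    info.valid            = (exc_return & 0xFFFFFFE0u) == 0xFFFFFFE0u;
    info.return_to_thread = (exc_return & (1u << 3)) != 0u;
    info.use_psp          = (exc_return & (1u << 2)) != 0u;
    info.basic_frame      = (exc_return & (1u << 4)) != 0u;
    return info;
}
```

    For example, 0xFFFFFFFD decodes as "return to thread mode, unstack from PSP, basic frame", while 0xFFFFFFF1 decodes as "return to handler mode, unstack from MSP, basic frame".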

    The EXC_RETURN is actually one real cool feature of the Cortex architecture. It means that you do not need an RFI instruction (Return From Interrupt), as you use the standard "return-instruction" to return from an interrupt. So there's no real difference between writing an interrupt-routine and a normal subroutine for a Cortex-M based microcontroller.

    I remember in the 80's (speaking about the M68xxx in particular), you would have to unstack some 'info-words', depending on which kind of interrupt occurred, and you'd have to make sure that the stack-frame had the correct format. Sometimes you'd have to modify the values on the stack, before returning, and if that wasn't enough, you had great chances that your interrupt would crash the entire system if you made a minor mistake.

    It's not like that with an ARM Cortex-M microcontroller. The architecture makes your interrupt and exception handling much more robust; you don't have to worry too much about reentrancy, and if you're writing a context-switcher from scratch, ARM did most of it for you in hardware in advance.

  • That is correct.

    -Note: I wrote earlier that you can use the interrupt routine as a subroutine (or vice versa), but that implies that you're not passing parameters to the routine.

    -It also implies that the routine handles pending bits correctly, so that if it needs to clear the pending bits for the interrupt, it should not be clearing the pending bits if called as a subroutine.

    For a SysTick ISR, this is no problem, because there's no pending bits to clear, so you could actually 'bump' your systick counter by calling your systick interrupt (I don't have any idea on why one would do that at this moment, though; it's just an example).

    Uhm... Thinking about it, you can actually make it possible to pass parameters to a hybrid subroutine / Interrupt Service Routine.

    And again, I can't find a reason for doing so. But let's play with the idea...

    According to the Exception model, R0-R3 are at the lowest addresses of the stacked frame (right where SP points on entry), so we can read them using an LDM instruction:

                        .thumb_func
    myISR:              ldm                 sp,{r0-r3}          /* [5] read four 32-bit parameters from the stacked frame */
                                                                /* falls through into mySubroutine */
                        .thumb_func
    mySubroutine:
                        /* ... do some common things here ... */
                        bx                  lr

    -So if calling the function manually, you'll call 'mySubroutine', where you've prepared R0-R3 in advance (that means you don't push the parameters onto the stack).

    If using it from an interrupt, R0-R3 would be read from the stack.

    But it doesn't make much sense doing this; it would be just as easy to branch to mySubroutine directly after setting the parameters in myISR, and then mySubroutine could be a plain standard C subroutine.

  • ARM recommends the following code

    __asm void SVCHandler(void)
    {
        IMPORT SVCHandler_main
        TST lr, #4
        ITE EQ
        MRSEQ R0, MSP
        MRSNE R0, PSP
        B SVCHandler_main
    }

    See SVC_Handler for Cortex M0

    It stores the state on the original stack; I can't say I feel totally happy with that. Using sp will only work if PSP isn't used.

  • Good input! -There are some flaws in my above 'example', which your snippet fixes.

    Detecting a subroutine call could probably be done by an unsigned compare of LR against 0xffffff00; if 'higher', then it's an interrupt, otherwise it's a subroutine call.
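    As a rough C sketch of that compare: the function name is hypothetical, and it assumes the value LR held on entry has already been captured (on a real target that takes a line of assembly at the very start of the routine, before LR is touched).

```c
#include <stdbool.h>
#include <stdint.h>

/* Heuristic from the post above: every EXC_RETURN value is numerically
 * higher than 0xFFFFFF00, while an ordinary return address is a code
 * address far below that. Hypothetical helper, not an ARM API. */
static bool entered_as_exception(uint32_t lr_at_entry)
{
    return lr_at_entry > 0xFFFFFF00u;
}
```

    With this, 0xFFFFFFF9 is classified as an exception entry, and a typical Thumb return address such as 0x08001235 as a plain subroutine call.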