Hard fault exception using RTOS and retargeting semihosting

Hi,

I've run into a strange problem using cmsis_os (Keil RTX), retargeting of semihosting to a uart, and interrupts.

I have a simple project set up to use CMSIS-RTOS (importing rtx_cm4.lib) and uart rx interrupts (using the stm32f4xx HAL libraries). Using microlib everything works fine, but as soon as I move to retargeting semihosting (using basically just the retarget.c file from Keil) I end up in the hardfault handler.

My code for main is just:

int main()
{
  // initialise the real time kernel
  osKernelInitialize();

  // we need to initialise the hal library and set up the SystemCoreClock
  // properly
  HAL_Init();
  configure_168MHz();

  // set up the uart (with rx interrupts enabled)
  init_uart(9600);
  enable_rx_interrupt();

  // print a status message
  printf("we are alive!\r\n");

  // start everything running
  osKernelStart();
}

and I get the "we are alive!" message displayed on the uart, but as soon as I type anything into the uart window, the uart rx interrupt triggers and then dumps me into the hardfault handler. However, there isn't much information in the registers. This is what I get:

in hard fault handler
SCB->HFSR  = 0x40000000
SCB->CFSR  = 0x00000000
SCB->MMFAR = 0xe000ed34
SCB->BFAR  = 0xe000ed38

stack dump:
SP         = 0x20001d40
R0         = 0x20001730
R1         = 0x0000ffff
R2         = 0x080005ed
R3         = 0x0000000a
R12        = 0x0800425d
LR         = 0x080038ef
PC         = 0x08000ac0
PSR        = 0x21000057

So I can tell I have ended up with a forced hard fault, but the CFSR register is empty so I can't tell what caused it.

Could anybody give me any pointers here? The same code works with no RTOS, and the same code works with no retargeting - just not with both RTOS and retargeting!

Thanks for your time!

Alex
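
Even with CFSR empty, the first register dump carries one more clue: bits [8:0] of the stacked PSR hold the number of the exception that was active when the fault was taken. Below is a minimal sketch of decoding it, written as plain C so it can be tried on a host. The function names are made up for illustration, and the frame layout assumed is the standard eight-word Cortex-M exception frame (R0, R1, R2, R3, R12, LR, PC, PSR).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define FRAME_PSR_INDEX 7u   /* PSR is the eighth word of the stacked frame */

/* Bits [8:0] of the stacked PSR (the IPSR field) give the active exception:
   0 = thread mode, 1..15 = system exceptions, 16 and up = external IRQs. */
static uint32_t active_exception(const uint32_t *frame)
{
    assert(frame != NULL);
    return frame[FRAME_PSR_INDEX] & 0x1FFu;
}

/* Returns the external interrupt number, or -1 if the active exception
   is not an external interrupt. */
static int irq_number(const uint32_t *frame)
{
    uint32_t exc = active_exception(frame);
    return (exc >= 16u) ? (int)(exc - 16u) : -1;
}

/* Hypothetical helper: say where the processor was when it faulted. */
static void describe_fault_context(const uint32_t *frame)
{
    uint32_t exc = active_exception(frame);
    int irq = irq_number(frame);

    if (exc == 0u)
        printf("fault taken in thread mode\n");
    else if (irq >= 0)
        printf("fault taken inside the handler for IRQ %d\n", irq);
    else
        printf("fault taken inside system exception %u\n", (unsigned)exc);
}
```

Applied to the dump above, PSR = 0x21000057 gives exception number 87, that is IRQ 71, so the fault was taken while an interrupt handler was running. That fits the observation that it only happens once the uart rx interrupt fires; the device header lists which peripheral IRQ 71 belongs to.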

  • Go look at exactly what instruction faulted. If it is a BKPT then you haven't got the fputc() stuff targeted properly.

  • Hi,

    Thanks for your response.

    It isn't BEAB BKPT (I have fixed that already :) - so it's not that fputc isn't implemented properly. However, I thought about it and I seem to remember reading in the ARM documentation that printf from stdio is thread safe (in that it uses mutexes to control access). I was using printf to echo characters from the rx interrupt callback. When I removed printf and replaced it with serial_write (which is just really a wrapper around the HAL uart transmit code) everything worked again :)

    Looking into it some more, the mbed RTOS handbook mentions not using printf from ISRs (which I guess is really what the rx callback is):

    developer.mbed.org/.../RTOS

    Is there anywhere else I need to be careful using printf from the standard libraries?

    Regards,

    Alex
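
A common way to follow the "no printf from ISRs" advice is to have the rx callback only store the received byte and let a thread do the printing later. The sketch below shows one such arrangement, a small ring buffer with a single producer (the interrupt callback) and a single consumer (one thread). All names here are hypothetical and are not part of the HAL or RTX; on a real target the thread would typically wait on an RTOS signal rather than poll.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_SIZE 64u   /* must be a power of two for the index mask */

typedef struct {
    volatile uint32_t head;        /* written only by the ISR    */
    volatile uint32_t tail;        /* written only by the thread */
    uint8_t data[RX_BUF_SIZE];
} rx_ring_t;

/* Called from the rx interrupt callback: store the byte and return.
   No printf, no mutexes, no blocking. Returns false if the buffer is full. */
static bool rx_ring_put(rx_ring_t *rb, uint8_t byte)
{
    uint32_t head = rb->head;

    if (head - rb->tail == RX_BUF_SIZE)
        return false;                       /* full: drop the byte */
    rb->data[head & (RX_BUF_SIZE - 1u)] = byte;
    rb->head = head + 1u;                   /* publish after the data is written */
    return true;
}

/* Called from a thread: fetch one byte if available. The thread is the
   place where printf (or any call that takes a mutex) belongs. */
static bool rx_ring_get(rx_ring_t *rb, uint8_t *out)
{
    assert(rb != NULL && out != NULL);
    uint32_t tail = rb->tail;

    if (tail == rb->head)
        return false;                       /* empty */
    *out = rb->data[tail & (RX_BUF_SIZE - 1u)];
    rb->tail = tail + 1u;
    return true;
}
```

The free-running head and tail counters rely on unsigned wrap-around, which is why the size has to be a power of two. Because each index is written by only one side, the two sides do not need a lock on a single-core Cortex-M.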

  • Hi,

    Further to my last post, it doesn't appear to just happen when using printf in an interrupt handler, but also when using it in the uart_rx_thread ...

    My uart_rx_thread is simply:

    // uart receive thread
    void uart_rx_thread(void const *argument)
    {
            // print some status message ...
            printf("still alive!\r\n");
    
            // infinite loop ...
            while(1)
            {
                    // this osDelay is needed if the priority of this thread is greater than
                    // that of the others (otherwise it is always ready to run and therefore
                    // nothing else does!)
                    osDelay(1000);
            }
    }
    

    However, if I switch to using microlib everything works again. Is there a way of seeing what mutexes are being used? Is there any compelling reason to use the stdio library over microlib (I have looked at the differences but, apart from the mutexes, I couldn't see any reason not to use microlib)?

    The stack trace that has been dumped out to the debug viewer from my hard fault handler is:

    in hard fault handler
    SCB->HFSR  = 0x40000000
    SCB->CFSR  = 0x00008200
    SCB->MMFAR = 0x803a03a2
    SCB->BFAR  = 0x803a03a2
    
    stack dump:
    SP         = 0x20001dd8
    R0         = 0xf75cf6fc
    R1         = 0x20000234
    R2         = 0x803a03a0
    R3         = 0x00000000
    R12        = 0x0800582d
    LR         = 0x08005873
    PC         = 0x080059f2
    PSR        = 0x2100020b
    

    But if I go to 0x803a03a2 (which I believe should be the instruction where the bus fault occurred) all it says is:

    0x803A03A2 0000      MOVS          r0,r0
    

    Any ideas??

    Regards,

    Alex

  • To be honest I think looking at the code around PC and LR would be more enlightening. And digging through the stack frame.

    Watch for things blowing out the stack, or corrupting the content.

  • Would that not be the address access it is objecting to, not the instruction faulting location?

    Do you have code that is attempting to write to locked flash memory?

    More likely a STR R1,[R2]

  • Thanks for your help!


    Would that not be the address access it is objecting to, not the instruction faulting location?

    You are right, the Keil AppNote (http://www.keil.com/appnotes/files/apnt209.pdf) says:

    "The value of SCB->BFAR indicates the memory address that caused a Bus Fault and is valid if the bit BFARVALID in the SCB->CFSR register is set."


    To be honest I think looking at the code around PC and LR would be more enlightening. And digging through the stack frame.

    I'm afraid this is about where my knowledge ends! I have never been hugely comfortable with assembly code and, although I can just about work out what each instruction is doing, I find it really difficult to fit it all together into the bigger picture! Any help you can offer here (or pointers to further reading) would be gratefully received!

    PC points to:

    0x080059F2 7895      LDRB          r5,[r2,#0x02]
    

    LR points to:

    0x08005873 01E0      LSLS          r0,r4,#7
    


    Watch for things blowing out the stack, or corrupting the content.

    How can I do this?


    Do you have code that is attempting to write to locked flash memory?

    I don't think so - all I'm doing is printf from the thread.

    Thanks again for all your help!

    Regards,

    Alex
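
The BFARVALID rule quoted from the app note can be applied mechanically. Here is a small sketch that names the flags set in a CFSR value, using the ARMv7-M bit positions; the helper names are invented for illustration, and it runs on a host given the value copied from a dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define CFSR_BFARVALID (1u << 15)

typedef struct { uint32_t mask; const char *name; } fault_bit_t;

/* CFSR = MemManage status [7:0], BusFault status [15:8], UsageFault [31:16] */
static const fault_bit_t cfsr_bits[] = {
    { 1u << 0,  "IACCVIOL"    }, { 1u << 1,  "DACCVIOL"   },
    { 1u << 3,  "MUNSTKERR"   }, { 1u << 4,  "MSTKERR"    },
    { 1u << 5,  "MLSPERR"     }, { 1u << 7,  "MMARVALID"  },
    { 1u << 8,  "IBUSERR"     }, { 1u << 9,  "PRECISERR"  },
    { 1u << 10, "IMPRECISERR" }, { 1u << 11, "UNSTKERR"   },
    { 1u << 12, "STKERR"      }, { 1u << 13, "LSPERR"     },
    { 1u << 15, "BFARVALID"   }, { 1u << 16, "UNDEFINSTR" },
    { 1u << 17, "INVSTATE"    }, { 1u << 18, "INVPC"      },
    { 1u << 19, "NOCP"        }, { 1u << 24, "UNALIGNED"  },
    { 1u << 25, "DIVBYZERO"   },
};

/* Print the name of every flag set in cfsr; returns how many were set. */
static unsigned decode_cfsr(uint32_t cfsr)
{
    unsigned count = 0;

    for (size_t i = 0; i < sizeof cfsr_bits / sizeof cfsr_bits[0]; i++) {
        if (cfsr & cfsr_bits[i].mask) {
            printf("  %s\n", cfsr_bits[i].name);
            count++;
        }
    }
    return count;
}

/* BFAR only holds a meaningful address while BFARVALID is set. */
static int bfar_is_valid(uint32_t cfsr)
{
    return (cfsr & CFSR_BFARVALID) != 0;
}

/* Returns the address of the failed data access; check validity first. */
static uint32_t faulting_data_address(uint32_t cfsr, uint32_t bfar)
{
    assert(bfar_is_valid(cfsr));
    return bfar;
}
```

For CFSR = 0x00008200 this reports PRECISERR and BFARVALID: a precise data bus error, with BFAR (0x803a03a2) holding the address of the access that failed rather than the address of an instruction. For the first dump, where CFSR was 0x00000000, no flag is set and the MMFAR and BFAR values carry no information.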

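  • You'd want to look *around* the PC address, perhaps a couple of instructions prior. With a precise fault (PRECISERR set in CFSR, as in your second dump) the stacked PC is the failing instruction itself; with an imprecise fault it can be several instructions past the failing one, because stuff that's in write buffers can be queued from even earlier instructions.

    i.e. before and including 0x080059F2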

    For the stack you'd want a clear understanding of the size and usage. Keil by default creates a 1KB stack, which is typically inadequate for most use cases, especially if stack hogs like printf/scanf are used. Fill the stack allocation with a fixed, non-zero character that you can recognize when you display memory in the debugger, or write code that walks the stack to determine the maximal depth.
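
The fill-and-walk idea can be sketched in a few lines of C. This is only an illustration under stated assumptions: the function names and the fill value are arbitrary, the region passed in must be a stack that is not yet in use (the main stack is usually painted from startup code before main runs), and the stack is assumed to grow downwards as it does on Cortex-M.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5A5A5A5u   /* arbitrary, easy to spot in a memory window */

/* Fill a stack region with a known pattern before the stack is in use. */
static void stack_paint(uint32_t *base, size_t words)
{
    assert(base != NULL);
    for (size_t i = 0; i < words; i++)
        base[i] = STACK_FILL;
}

/* The stack grows down from base + words towards base, so words that were
   never touched still hold the pattern at the low addresses. Counting them
   from the base gives the headroom that was left at the deepest point. */
static size_t stack_unused_words(const uint32_t *base, size_t words)
{
    assert(base != NULL);
    size_t unused = 0;

    while (unused < words && base[unused] == STACK_FILL)
        unused++;
    return unused;
}
```

A result of zero means the pattern was overwritten all the way down to the base, which is the signature of an overflow. The count is a lower bound on usage only: a stacked value that happens to equal the fill pattern is counted as unused.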

  • Thanks for your help and patience!

    I tried increasing the stack size to 5 KB, but this didn't make any difference. However, I thought that each thread had its own stack? In the debugger "system and thread viewer" window, there is some information about the stack load of each thread and nothing is above 32%.

    The instructions immediately around 0x080059F2 are:

    0x080059E6 2301      MOVS          r3,#0x01
    0x080059E8 788C      LDRB          r4,[r1,#0x02]
    0x080059EA E000      B             0x080059EE
    0x080059EC 4610      MOV           r0,r2
    0x080059EE 6842      LDR           r2,[r0,#0x04]
    0x080059F0 B112      CBZ           r2,0x080059F8
    0x080059F2 7895      LDRB          r5,[r2,#0x02]
    0x080059F4 42A5      CMP           r5,r4
    

    but these don't mean anything to me I'm afraid :~(

    I haven't yet tried your suggestion of filling the stack allocation with a fixed, non-zero character - I will try to find time for that tomorrow - but I'm pretty sure that I'm not just overflowing the stack (otherwise I would have thought that increasing the stack size by 5 times would have produced a different error at least!).

    Again, thanks for all your time!

    Regards,

    Alex
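
For what it is worth, the listing around 0x080059F2 reads like a loop walking a singly linked list: r0 is the current node, the word at offset 4 is the pointer to the next node, and the byte at offset 2 of each node is compared with the same byte of the object in r1. A rough C equivalent follows, as a sketch only. The struct, the field names and the function name are invented, the meaning of the byte at offset 2 is a guess, and the branch that follows the CMP is not in the listing, so the loop condition is an assumption. On a 64-bit host the field offsets differ from the target; only the logic is meant to match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct node {
    uint8_t      b0;     /* offset 0 on the target: meaning unknown     */
    uint8_t      b1;     /* offset 1 on the target: meaning unknown     */
    uint8_t      key;    /* offset 2: the byte that is being compared   */
    uint8_t      b3;     /* offset 3 on the target: meaning unknown     */
    struct node *next;   /* offset 4 on the target: link to next node   */
} node_t;

/* Walk the list and return the node after which `item` would be placed. */
static node_t *find_insert_point(node_t *head, const node_t *item)
{
    assert(head != NULL && item != NULL);
    uint8_t key = item->key;                /* LDRB r4,[r1,#0x02]            */
    node_t *cur = head;
    node_t *next;

    while ((next = cur->next) != NULL) {    /* LDR r2,[r0,#0x04] ; CBZ r2    */
        if (next->key < key)                /* LDRB r5,[r2,#0x02] ; CMP r5,r4 */
            break;                          /* assumed: branch is not shown  */
        cur = next;                         /* MOV r0,r2                     */
    }
    return cur;
}
```

Read that way, the faulting access is the load of `next->key`. R2 in the stack dump is 0x803a03a0 and BFAR is 0x803a03a2, which is R2 + 2, and 0x803a03a0 is not a plausible RAM or flash address on this part. That suggests the code itself is fine and that a link pointer in the list it walks had already been overwritten by something else.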