Unexplained resets of NXP LPC2368 using RTX when modifying variables

Hello everyone,

today I'm asking for hints on a tricky problem. We have firmware that uses the RTX kernel running on an NXP LPC2368. The device the firmware is written for is now getting a new LC display.
My task is to change the firmware so that it can drive the new display.

I've spent some weeks on this during the year, and from time to time I've had the problem that the controller resets shortly after start, again and again...

Every time this behaviour occurred, I deleted one or more obsolete variables (mostly global) or functions. In most cases I solved the problem by searching for other obsolete variables and deleting them from the source code - trial and error. That is really time-consuming.

While testing the firmware on Wednesday, I tried to make the adapted and modified routine for writing data to display RAM a little faster. I moved a global unsigned int into the function and changed it to static unsigned char, because the value it has to carry is 0x0D at most.

After flashing the firmware to the controller, the controller hung after a short, random time.

Yesterday I tried to solve the problem of the firmware hanging at random times and found that it happens when no task is running: the OS calls os_idle_demon() and is not able to return from it. I found a workaround on the web: creating an empty low-priority task that does not use any os_wait functions, which prevents the OS from calling the idle task. (It has something to do with incorrect interrupt states on returning from the idle task.)

Today I tried to make the display writing function faster still and changed two unsigned char variables inside the function from static to non-static. After flashing this firmware, the controller resets again and again. I will now try to find out why the controller behaves this way.

What I have found out so far is that no watchdog is enabled by user code (is it part of the OS?). os_stk_overflow and os_idle_demon are not called by the OS. I debug the firmware using ULINK2.

Any ideas where to look for the problem?

Best regards

Parents
  • Good morning Marc, good morning Per.

    First of all, I want to thank you again for all the great input.

    I understand what you both have explained to me, and I have decided to implement your recommendations step by step.

    => The first step is to implement the watchdog to force a real reset on error exceptions. This is better behaviour than simply jumping to the start address - that's what I have learned. ;-)

    This step is nearly finished and it should be enough for the moment, since this is much more than I was instructed to do with the firmware. With this step I should be able to debug the firmware better in case of an error exception.

    => The second step is to implement the following mechanism:

    Active tasks do not wait indefinitely, and every task has its own timer.

    The watchdog is kicked by the lowest-priority task only (actually my idle task) to ensure that enough CPU capacity is available.
    Furthermore, this task checks whether all other tasks are running by checking their timers.
    If a task gets stuck, the checking task can restart it.
    If restarting the deadlocked task is not enough (depending on the task's function), the checking task could reset the whole device via external reset (I'm sure I can get access to it) after error information has been saved to memory (task name, register states, ...).
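
    A minimal sketch of such a timer check, in plain C so that it can be tried on a host. The names task_heartbeat, heartbeat_touch, find_stuck_task and the limit HEARTBEAT_TIMEOUT_TICKS are hypothetical; on the target, the tick value would come from the system timer, and the restart or reset action is left to the caller.

```c
#include <stdint.h>

#define NUM_TASKS               3u    /* hypothetical number of monitored tasks */
#define HEARTBEAT_TIMEOUT_TICKS 100u  /* hypothetical limit, in system ticks */

/* Tick at which each monitored task last reported in. */
static volatile uint32_t task_heartbeat[NUM_TASKS];

/* Called by a monitored task once per loop iteration. */
void heartbeat_touch(uint32_t task, uint32_t now)
{
    if (task < NUM_TASKS) {
        task_heartbeat[task] = now;
    }
}

/* Called by the lowest-priority task before it kicks the watchdog.
   Returns the index of the first overdue task, or -1 if all are alive. */
int find_stuck_task(uint32_t now)
{
    uint32_t i;

    for (i = 0u; i < NUM_TASKS; i++) {
        /* Unsigned subtraction stays correct across tick wrap-around. */
        if ((uint32_t)(now - task_heartbeat[i]) > HEARTBEAT_TIMEOUT_TICKS) {
            return (int)i;
        }
    }
    return -1;
}
```

    The checking task would kick the watchdog only while find_stuck_task() returns -1.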

    If a program error exception is detected, the device would be restarted via external reset once the error conditions have been saved to memory by the respective handlers.

    Is the second step consistent with what you both have suggested?

    I have another question regarding the user-defined stack size for every task.

    The RTOS is initialized by os_sys_init(task1).
    I want task1 to have its own user-defined stack size using os_sys_init_user(task1, ...), but I do not want to reserve memory for its stack permanently.

    How can I provide memory dynamically for task1, which ends with os_tsk_delete_self()?

    Looking forward to your answers.

    Best regards
    Robert

Children
  • Robert,

    Don't use dynamic memory if you can avoid it. It will lead to memory fragmentation - unless you allocate a big chunk at program startup and distribute it at will (memory manager...). This can indeed reduce your binary size significantly if you have enough RAM.

    the checking task can restart it.

    In my opinion this is a bad design. You should test to make sure no hangups can occur. And how will you "release" a task? What if it's sitting in a

    for (;;) ;
    

    ?
    Better to reduce resource consumption and defer to the common, trusted and working solution of maintaining a bitmap, each bit indicating "task n is alive". If this regularly tested bitmap is not set entirely to 1 (or 0), one task, or preferably the hardware abstraction layer, will reset the device. This way, the watchdog is serviced from one place only, reducing the chance of faulty code servicing it _even_ in the event of "task failure".
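
    A rough sketch of the bitmap idea, written as host-compilable C. The names alive_bits, task_report_alive and all_tasks_alive_and_clear are made up for illustration, and the number of tasks is an assumption. On the target, the read-modify-write of the bitmap would also need protection against preemption.

```c
#include <stdint.h>

#define NUM_TASKS      3u                        /* hypothetical task count */
#define ALL_ALIVE_MASK ((1u << NUM_TASKS) - 1u)  /* one bit per task */

/* Bit n set means "task n has reported since the last check". */
static volatile uint32_t alive_bits;

/* Each task calls this regularly with its own index. */
void task_report_alive(uint32_t task)
{
    if (task < NUM_TASKS) {
        /* Not atomic: guard against preemption on real hardware. */
        alive_bits |= (1u << task);
    }
}

/* Called periodically from the single place that services the watchdog.
   Returns 1 and clears the bitmap if every task has reported, else 0. */
int all_tasks_alive_and_clear(void)
{
    if ((alive_bits & ALL_ALIVE_MASK) == ALL_ALIVE_MASK) {
        alive_bits = 0u;
        return 1;
    }
    return 0;
}
```

    The watchdog would be fed only when all_tasks_alive_and_clear() returns 1; otherwise the feed is skipped and the watchdog resets the device.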

  • the checking task can restart it.

    Of course, deleting a task and re-spawning it is always possible in an attempt to introduce a "self correcting" system. Note, however, that this could lead to inter-task synchronization problems.

  • Hello Tamir.

    Thank you for your explanations. I will implement the stack for task1 as permanent memory reservation.

    Additionally, I will think over my plans for step 2 when the time has come to implement it.

    I have an urgent question regarding the watchdog. My device resets after a macro for kicking the watchdog is called.

    I implemented the watchdog as follows:

    // macro for kicking watchdog (in project.h):
    #define WDT_KICK { WDFEED= 0xAA; WDFEED= 0x55; }
    
    // watchdog initialization (in project.c):
    __task void task1(void)
    {
      // ...
      WDCLKSEL= 0x00000001;
      WDTC= 0x00000FFF;
      WDMOD= 0x03;
      WDT_KICK              // <= execution jumps to DAbt_Handler in startup file? 
      // ...
    }
    
    int main(void)
    {
      // ...
      os_sys_init(task1);
    }
    
    // startup file:
    ; Part 1: Physical vector table with Load Register Instructions (LDR) on each vector,
    ; loads values in constants table (32bit-wide memory locations) to program counter (PC) forcing a jump to memory location
    CDCVectors      LDR     PC, Reset_Addr
                    LDR     PC, Undef_Addr
                    LDR     PC, SWI_Addr
                    LDR     PC, PAbt_Addr
                    LDR     PC, DAbt_Addr
                    NOP                                                             ; Reserved Vector
    ;               LDR     PC, IRQ_Addr
                    LDR     PC, [PC, #-0x0120]              ; Vector from VicVectAddr
                    LDR     PC, FIQ_Addr
    
    ; Part 2: Constants table with addresses of jump targets
    Reset_Addr      DCD     Reset_Handler
    Undef_Addr      DCD     Undef_Handler                   ; RoS| 15.12.11: For watchdog (to get real reset) use Undef_Handler instead of Reset_Handler! Was: Reset if OP code isn't ARM or THUMB
    SWI_Addr        DCD     SWI_Handler
    PAbt_Addr       DCD     PAbt_Handler                    ; RoS| 15.12.11: For watchdog (to get real reset) use PAbt_Handler instead of Reset_Handler! Was: Reset on prefetch abort exception
    DAbt_Addr       DCD     DAbt_Handler                    ; RoS| 15.12.11: For watchdog (to get real reset) use DAbt_Handler instead of Reset_Handler! Was: Reset on data abort exception
                    DCD     0                                               ; Reserved Address
    IRQ_Addr        DCD     IRQ_Handler
    FIQ_Addr        DCD     FIQ_Handler                             ; RoS| 13.12.11: Jump target for the one and only fast interrupt; Actually: endless loop - means "not defined"
    
    
            IMPORT SWI_Handler                                                      ; RoS| imported SWI label
    
            EXTERN DAbt_Handler                                                     ; RoS| 29.11.11: for RTA usage (see http://www.keil.com/support/man/docs/ulink2/ulink2_ra_modifying_startup.htm) - external DAbt_Handler
    
    ; Part 3: Labels to which program jumps to if exception occurs,
    ; and then endless jumps to label
    Undef_Handler   B       Undef_Handler
    ;SWI_Handler     B       SWI_Handler                    ; RoS| 13.12.11: Another SWI_Handler is imported
    PAbt_Handler    B       PAbt_Handler
    ;DAbt_Handler    B       DAbt_Handler                   ; RoS| 29.11.11: for RTA usage (see http://www.keil.com/support/man/docs/ulink2/ulink2_ra_modifying_startup.htm) - endless loop obsolete
    IRQ_Handler     B       IRQ_Handler
    FIQ_Handler     B       FIQ_Handler
    
    
    ; Reset Handler
    
                    EXPORT  Reset_Handler
    Reset_Handler
    
    

    What may be going on there? DAbt_Handler is part of the Real-Time Agent. How can I use this to find out what the error is?
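
    As a side check on the values above, the resulting timeout can be estimated as sketched below. This assumes the fixed divide-by-4 prescaler and the minimum count of 0xFF described in the watchdog chapter of the LPC23xx user manual, and that WDCLKSEL = 1 selects PCLK. The helper name wdt_timeout_us is hypothetical, and the 12 MHz PCLK in the note is only an assumed example, not a value from this project.

```c
#include <stdint.h>

#define WDT_PRESCALER 4u     /* fixed prescaler (assumption, check the user manual) */
#define WDT_MIN_COUNT 0xFFu  /* smaller WDTC values are treated as 0xFF */

/* Approximate watchdog timeout in microseconds for a given WDTC value
   and watchdog clock frequency in Hz. Returns 0 for a zero clock. */
uint64_t wdt_timeout_us(uint32_t wdtc, uint32_t wdclk_hz)
{
    uint64_t clocks;

    if (wdclk_hz == 0u) {
        return 0u;
    }
    if (wdtc < WDT_MIN_COUNT) {
        wdtc = WDT_MIN_COUNT;
    }
    /* The counter runs from WDTC down to 0, one step every 4 clocks. */
    clocks = ((uint64_t)wdtc + 1u) * WDT_PRESCALER;
    return (clocks * 1000000u) / wdclk_hz;
}
```

    With an assumed PCLK of 12 MHz, wdt_timeout_us(0x00000FFF, 12000000u) comes out at roughly 1.4 ms, so a value like this would leave very little time between feeds.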

  • Interrupts _must_ be disabled while servicing the watchdog!

    Use

    __disable_irq() ;
    

  • Or, even better - make the servicing function a SWI function.
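
    A sketch of the feed sequence wrapped in a critical section. To keep it compilable off-target, the register and the two interrupt helpers are stand-ins (WDFEED_REG, irq_save_disable, irq_restore and wdfeed_write are hypothetical names); on the LPC23xx they would be replaced by the real WDFEED register and the compiler's interrupt intrinsics. Note that the ARM compiler documentation describes __disable_irq() as having no effect in User mode, so if the tasks run in User mode, the SWI approach may be the one that actually works.

```c
#include <stdint.h>

/* Host stand-ins, for illustration only. */
static volatile uint32_t WDFEED_REG;  /* stand-in for the WDFEED register */
static int irq_enabled = 1;           /* models the CPU interrupt enable state */
static int feeds_with_irq_on;         /* bookkeeping: writes done with IRQs on */

/* Disable interrupts and return the previous state. */
static int irq_save_disable(void)
{
    int prev = irq_enabled;
    irq_enabled = 0;
    return prev;
}

/* Restore the interrupt state saved by irq_save_disable(). */
static void irq_restore(int prev)
{
    irq_enabled = prev;
}

/* Write one feed value and record whether interrupts were enabled. */
static void wdfeed_write(uint32_t value)
{
    if (irq_enabled) {
        feeds_with_irq_on++;
    }
    WDFEED_REG = value;
}

/* Feed sequence: 0xAA followed by 0x55, with no interrupt in between. */
void wdt_feed(void)
{
    int prev = irq_save_disable();

    wdfeed_write(0xAAu);
    wdfeed_write(0x55u);
    irq_restore(prev);  /* re-enable only if interrupts were enabled before */
}
```

    Saving and restoring the previous state, rather than unconditionally re-enabling, keeps the function safe to call from code that already runs with interrupts disabled.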

  • *AAAARRRGGGHHH*

    I read about disabling interrupts... LPC23xx user manual:

    "Interrupts should be disabled during the feed sequence. An abort condition will occur if an interrupt happens during the feed sequence."

    Why don't they mark such important sentences with an exclamation mark in their user manuals?

    Thank you for helping out immediately Tamir!

    PS: I will re-read the Insider's Guide to check whether I am able to implement the feeding of the predator via a SWI function.

  • I implemented kicking the watchdog as a SWI function that is first called in task1:

    // project.h:
    extern void __swi(12) WDT_KICK(void);   // RoS| 15.12.2011: feeds the watchdog
    
    // watchdog kicking and initialization (in project.c):
    void __SWI_12 (void)
    {
            WDFEED= 0xAA;
            WDFEED= 0x55;
    }
    
    
    __task void task1(void)
    {
      // ...
      WDCLKSEL= 0x00000001;
      WDTC= 0x00000FFF;
      WDMOD= 0x03;
      WDT_KICK();                           // <= execution STILL jumps to DAbt_Handler in startup file? 
      // ...
    }
    
    int main(void)
    {
      // ...
      os_sys_init_user(task1, 1, &id1_stk, sizeof(id1_stk));
    }
    
    // SWI_Table.s:
    // ...
    ; Import user SWI functions here.
      IMPORT  __SWI_8
      IMPORT  __SWI_9
      IMPORT  __SWI_10
      IMPORT  __SWI_11
      IMPORT  __SWI_12
    
      EXPORT  SWI_Table
    
    SWI_Table
      DCD     __SWI_0                 ; SWI 0 used by RTL
      DCD     __SWI_1                 ; SWI 1 used by RTL
      DCD     __SWI_2                 ; SWI 2 used by RTL
      DCD     __SWI_3                 ; SWI 3 used by RTL
      DCD     __SWI_4                 ; SWI 4 used by RTL
      DCD     __SWI_5                 ; SWI 5 used by RTL
      DCD     __SWI_6                 ; SWI 6 used by RTL
      DCD     __SWI_7                 ; SWI 7 used by RTL
    
    ; Insert user SWI functions here. SWI 0..7 are used by RTL Kernel.
      DCD     __SWI_8                 ; SWI 8  User Function
      DCD     __SWI_9                 ; SWI 9  User Function
      DCD     __SWI_10                ; SWI 10 User Function
      DCD     __SWI_11                ; SWI 11
      DCD     __SWI_12                ; SWI 12 User function - Kick Watchdog with interrupts disabled (except FIQ)
    



    The device resets again and again... It still jumps to DAbt_Handler when executing WDFEED= 0xAA;...
    Since there is no FIQ used, I have no idea why this is not working! :(

    Could this have something to do with the RealTime Agent?