AT91SAM7X SYS IRQ: RL-ARM problems after changing from edge to level mode (for RTT etc.)

Hi there all at91sam7x developers,

We have serious problems running the RL-ARM library (MDK-ARM RealView Microcontroller Kit 4_01), because we need to use the SYS IRQ of the AT91SAM7X512 device. This interrupt is used not only by the PIT, which we use for the scheduler timer tick, but is also shared with other chip components such as DBGU, RSTC, RTT, WDT and PMC.

We cannot switch to one of the three other timers (TCx) because we already use them for other purposes.

In the default edge mode the RTT cannot work properly, and after 10-25 minutes we end up with a frozen scheduler and a permanently running idle task (Google finds other threads describing similar behaviour).

We changed the OS_TINIT macro (in RTX_Config.c) to select level mode for the SYS IRQ (AIC->AIC_SMR[ID_TC] = (AT91C_AIC_SRCTYPE_INT_HIGH_LEVEL| AT91C_AIC_PRIOR_HIGHEST)), as suggested everywhere on the web. We then noticed that the AT91C_TFIRQ macro, which triggers the clock interrupt by software, no longer works, because the AIC does not support software triggering of IRQs in level mode.
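Because the SYS IRQ is shared and a level-sensitive line stays asserted for as long as any source is pending, a handler in level mode generally has to service and clear every active source before it returns. The following is only a host-side sketch of that idea with simulated flags. The names `SRC_*`, `sim_raise`, `sys_irq_handler` and `sys_irq_line_asserted` are hypothetical and are not AT91 library or RL-ARM identifiers; on the real chip each peripheral has its own status register and its own way of being cleared.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the request flags of three SYS IRQ sources. */
#define SRC_PIT   (1u << 0)
#define SRC_RTT   (1u << 1)
#define SRC_DBGU  (1u << 2)

/* Simulated "request is active" state of the shared line (assumption). */
static volatile uint32_t pending_sources;

/* Per-source service counters, only used to observe the sketch. */
static uint32_t pit_ticks, rtt_events, dbgu_events;

/* Simulate one or more peripherals raising their request. */
void sim_raise(uint32_t src)
{
    pending_sources |= src;
}

/* Non-zero while at least one source still holds the line active. */
int sys_irq_line_asserted(void)
{
    return pending_sources != 0u;
}

/* Shared handler: loop until no source is pending, so the level-sensitive
   line is released before the handler returns. */
void sys_irq_handler(void)
{
    uint32_t active;

    while ((active = pending_sources) != 0u) {
        if (active & SRC_PIT) {
            pending_sources &= ~SRC_PIT;   /* clear the PIT request */
            pit_ticks++;                   /* the scheduler tick would run here */
        }
        if (active & SRC_RTT) {
            pending_sources &= ~SRC_RTT;   /* clear the RTT request */
            rtt_events++;
        }
        if (active & SRC_DBGU) {
            pending_sources &= ~SRC_DBGU;  /* clear the DBGU request */
            dbgu_events++;
        }
    }
}
```

In this sketch a source that is left uncleared would keep `sys_irq_line_asserted()` non-zero, which is the simulated counterpart of the interrupt re-entering immediately.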

At the moment I am trying to switch temporarily back to edge mode inside the AT91C_TFIRQ macro so that software triggering of the IRQ works again, and I am keeping my fingers crossed that this will run stably for hours:

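/* Temporarily select edge mode so that AIC_ISCR can trigger the IRQ by
   software, then restore the configured source type. */
#define AT91C_TFIRQ \
    AT91C_LOCK; AIC->AIC_SMR[ID_TC] = (AT91C_AIC_SRCTYPE_INT_POSITIVE_EDGE | AT91C_AIC_PRIOR_HIGHEST); \
    AT91C_UNLOCK; \
    AIC->AIC_ISCR = OS_TIM_; \
    AT91C_LOCK; AIC->AIC_SMR[ID_TC] = AT91C_AIC_SYSSRCTYPE; AT91C_UNLOCK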

In level mode it works properly as long as the load of os_xxx calls is very low. With the full application running, after about 1 hour the system behaves as if only the idle task were running (although breakpoints show that other tasks still run). For example, os_sem_wait calls do not return when the matching os_sem_send calls happen, and much more...

The stack usage of each of our 24 tasks is 50% at most.

Is there anyone out there with experience using the SYS IRQ of the AT91SAM7X together with the RL-ARM library, driven by the PIT (not any TCx)?

Please help me!!

Best regards
Helmut

  • What I was thinking was: isn't your problem that the scheduler may somehow miss the sys_ctrl interrupt, so that scheduling stops?

    In that case, is it possible for another interrupt handler that is activated regularly (possibly from one of the other three timers) to notice that scheduling has stopped, and then either reinitialize the sys_ctrl interrupt or manually trigger a new sys_ctrl interrupt?

    The specific thing about an edge-triggered interrupt is that if the edge is missed, it stays permanently missed. But if any other interrupt handler gets triggered and is able to keep track of time, it can detect when a specific interrupt source is, for some reason, no longer being serviced.

    A hung timer interrupt can be detected when too much time passes without the ISR touching a global variable.

    A hung UART receive interrupt can be detected when the serial buffer has data available but no UART_RX interrupt has time-stamped a global variable.

    A hung 1-second RTC can be detected when an internal timer counts more than 1 second without any RTC interrupt. A hung timer can be detected when enough RTC one-second interrupts happen without the timer ISR time-stamping a global variable.

    So in most situations it is possible to write an application that can auto-repair an ISR that isn't running. Of course, such a mechanism is only a second-line backup. The first-line goal is to figure out why an ISR is no longer triggered and whether there is a firmware bug that can be fixed. But now and then chip manufacturers happen to release hardware that just can't be 100% trusted, in which case a "software watchdog" supervising the problematic interrupt source is the only available route.
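The supervision idea described above could be sketched in C roughly as follows. This is a minimal host-side illustration, not RL-ARM or AT91 code: the supervised ISR bumps a heartbeat counter, and an independent periodic ISR checks whether the counter has moved. All names (`tick_isr_mark_alive`, `supervisor_check`, `recover_tick_source`, `SUPERVISOR_LIMIT`) are hypothetical, and the recovery step is only a placeholder for re-arming or re-triggering the real interrupt source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Supervisor runs allowed without a new heartbeat (assumed value). */
#define SUPERVISOR_LIMIT 3u

static volatile uint32_t tick_heartbeat;  /* written only by the supervised ISR */
static uint32_t last_seen_heartbeat;      /* owned by the supervisor */
static uint32_t stale_runs;               /* consecutive checks without progress */
static uint32_t recoveries;               /* number of recovery attempts so far */

/* Call this from the supervised ISR, e.g. the scheduler tick. */
void tick_isr_mark_alive(void)
{
    tick_heartbeat++;
}

/* Placeholder: on a target this would reinitialize or re-trigger the source. */
static void recover_tick_source(void)
{
    recoveries++;
}

/* Call this from an independent periodic ISR.
   Returns true when a recovery attempt was made during this call. */
bool supervisor_check(void)
{
    uint32_t now = tick_heartbeat;

    if (now != last_seen_heartbeat) {     /* the supervised ISR is alive */
        last_seen_heartbeat = now;
        stale_runs = 0u;
        return false;
    }
    if (++stale_runs < SUPERVISOR_LIMIT) {
        return false;                     /* not stale for long enough yet */
    }
    stale_runs = 0u;                      /* start a fresh observation window */
    recover_tick_source();
    return true;
}

/* Accessor so that diagnostics can report how often recovery was needed. */
uint32_t supervisor_recoveries(void)
{
    return recoveries;
}
```

Counting recoveries rather than repairing silently keeps the mechanism a second-line backup: a non-zero count is a hint that the underlying cause still needs to be found.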

  • I use the SYS IRQ in level mode, so I cannot miss any IRQ. I do this for all IRQs.

    Unfortunately, RL-ARM's rtx_config.c selects edge mode, but it works with that setting.

    When you use other peripherals that share this SYS IRQ, you need to change to level mode (even if you used edge mode before), especially for the RTT, which fires asynchronously to MCK. Level mode works fine, but RL-ARM's scheduler needs to be triggered by a simple software command for additional scheduler runs.

    When these additional runs of the RL-ARM scheduler do not happen for too long, internal queues seem to overflow (which is simply ignored in RL-ARM, as far as I can see in the source). This leads to lost messages and, for the internal queues, lost commands (those queued by any isr_xxx call of RL-ARM, such as isr_mbx_send).

    Thanks for your valuable input. Go on and kick my brain to the right point ;-)
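To illustrate the queue overflow concern raised above, here is a small sketch of a fixed-size request queue that counts rejected entries instead of dropping them silently. It is not the RL-ARM implementation; `isr_queue_t`, `isr_queue_put`, `isr_queue_get` and `QUEUE_SLOTS` are hypothetical names, and the code assumes a single execution context. On a target, the updates would have to be protected against interrupts.

```c
#include <stdbool.h>
#include <stdint.h>

#define QUEUE_SLOTS 4u  /* assumed capacity, chosen only for illustration */

typedef struct {
    uint32_t items[QUEUE_SLOTS];
    uint32_t head;     /* index of the oldest stored entry */
    uint32_t count;    /* number of entries currently stored */
    uint32_t dropped;  /* requests rejected because the queue was full */
} isr_queue_t;

/* Producer side (would be called from an ISR on a target).
   Returns false and records the loss when the queue is full. */
bool isr_queue_put(isr_queue_t *q, uint32_t item)
{
    if (q->count == QUEUE_SLOTS) {
        q->dropped++;              /* make the overflow visible */
        return false;
    }
    q->items[(q->head + q->count) % QUEUE_SLOTS] = item;
    q->count++;
    return true;
}

/* Consumer side (would be called from the scheduler run).
   Returns false when there is nothing to process. */
bool isr_queue_get(isr_queue_t *q, uint32_t *item)
{
    if (q->count == 0u) {
        return false;
    }
    *item = q->items[q->head];
    q->head = (q->head + 1u) % QUEUE_SLOTS;
    q->count--;
    return true;
}
```

A non-zero `dropped` count observed in a debugger or a diagnostic task would confirm that the consumer is not running often enough, which is the situation suspected in this thread.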