Hi,
I recently had problems with events (RTX ISR events) resulting in the OS_ERR_FIFO_OVF error in os_error(). This prompted me to try to understand more about this mechanism. I will explain what I know and what I don't know. Any input would be appreciated.
When isr_evt_set() is called, it seems that two entries are placed in the array os_fifo[]. The first, I am guessing, refers to the task control block of the task that must get the event, and the second is the event ID. The size of this queue is set in RTX_Conf_xx.c as OS_FIFOSZ.
What I don't know:
1. How can this queue be flushed?
2. Does os_evt_wait..() remove an event from the queue? If so, how? I can't see that being the case.
3. How can one check how full the queue is?
4. Does os_evt_clr() clear one event (and which one) or all events with the specified flags?
It's a pity that Keil/ARM don't make this information available.
I don't see how you think flushing all events would represent recovery - it would leave a number of threads very confused since you have violated the assumptions made when they were designed.
That is why a reboot is a good idea. An even better idea is to make sure your ISR load is not too high and that your ISRs do not set many events per interrupt.
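One way to keep the number of events per interrupt low is to let the ISR handle everything the hardware has buffered and then signal the task once. What follows is only a minimal sketch, assuming a UART-style receive interrupt: uart_rx_isr(), EVT_RX, rx_buf and rx_task_id are hypothetical names, and isr_evt_set() is replaced by a counting stub (isr_evt_set_stub) so the code can be compiled off-target. On RTX you would include RTL.h and call the real isr_evt_set() instead, which takes the event flags first and the task ID second.

```c
#include <stdint.h>
#include <stddef.h>

#define EVT_RX   0x0001u   /* hypothetical event flag for "receive data ready" */
#define RX_SIZE  64u       /* hypothetical ring buffer size */

uint8_t           rx_buf[RX_SIZE];
volatile uint32_t rx_head;          /* total bytes stored by the ISR */
uint32_t          evt_calls;        /* how often the stub was called */
uint32_t          rx_task_id = 1;   /* hypothetical ID of the receiving task */

/* Host stand-in for the RTX call isr_evt_set(event_flags, task_id). */
void isr_evt_set_stub(uint16_t event_flags, uint32_t task_id)
{
    (void)event_flags;
    (void)task_id;
    evt_calls++;
}

/* Stores all bytes the hardware had buffered, then signals the task once. */
void uart_rx_isr(const uint8_t *hw_fifo, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        rx_buf[rx_head % RX_SIZE] = hw_fifo[i];
        rx_head++;
    }
    if (n > 0u) {
        /* One event per interrupt, not one per byte. */
        isr_evt_set_stub(EVT_RX, rx_task_id);
    }
}
```

With this shape, a burst of eight bytes costs one post to the OS instead of eight, and the task reads everything that has accumulated in rx_buf when it wakes up.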
The important thing here is that you can have many interrupt sources. And it's possible to set up the hardware so the processor just moves from one ISR to the next without ever returning to "normal" operation - which means a situation where you can't expect a superloop or OS threads to get anything done. So a proper design must take into account how much the interrupts can stack up, to make sure the processor always has capacity for your real-time requirements. Your issue comes from your program failing this part.
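A rough way to check this is to add up, for every interrupt source, its worst-case rate multiplied by its worst-case ISR execution time. The sketch below does only that arithmetic. The type irq_source_t and the function isr_load_permille() are hypothetical names, and the estimate ignores interrupt entry/exit overhead and the time the OS needs after the ISRs, so treat the result as a lower bound on the real load.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint32_t max_rate_hz;  /* worst-case interrupt rate of this source */
    uint32_t worst_us;     /* worst-case ISR execution time, microseconds */
} irq_source_t;

/* Returns the worst-case CPU time spent in ISRs, in parts per thousand. */
uint32_t isr_load_permille(const irq_source_t *src, size_t n)
{
    uint64_t busy_us_per_s = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        busy_us_per_s += (uint64_t)src[i].max_rate_hz * src[i].worst_us;
    }
    /* One second is 1 000 000 us, so dividing by 1000 gives permille. */
    return (uint32_t)(busy_us_per_s / 1000u);
}
```

For example, with made-up figures of 10000 Hz at 20 us, 1000 Hz at 150 us and 100 Hz at 500 us, the ISRs alone take 400000 us of every second, so the function returns 400 (40 % of the CPU). If the result approaches 1000, the threads get almost no time to run.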
What overflows in this case is not a queue of events for your threads to process. Instead, the OS lets your interrupts run their things as fast as possible - the primary goal is to have a minimum of latency for what the ISR needs to do. And then - when the pending interrupts have been handled - the OS processes the results of any state changes your ISRs introduced, allowing the OS to decide which thread should next get processing time. So "clearing this queue" means "make the OS unable to know how to schedule". And overflowing this internal state means that you don't have enough time for normal code after the ISRs have been taken care of.
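The idea can be modelled on a host PC with a small fixed-size queue. This is only an illustration of the principle and not RTX's actual implementation: model_isr_post(), model_os_drain(), post_fifo and FIFO_SIZE are hypothetical names, with FIFO_SIZE standing in for OS_FIFOSZ.

```c
#include <stdint.h>
#include <stdbool.h>

#define FIFO_SIZE 4u   /* stands in for OS_FIFOSZ; kept small on purpose */
#define MAX_TASKS 4u   /* hypothetical number of tasks in the model */

typedef struct {
    uint8_t  task;     /* which task the request is for */
    uint16_t flags;    /* event flags to set for that task */
} post_t;

static post_t   post_fifo[FIFO_SIZE];
static uint32_t head, tail, count;

uint16_t task_events[MAX_TASKS];   /* per-task event flags seen by the OS */
bool     fifo_overflow;            /* RTX reports this as OS_ERR_FIFO_OVF */

/* ISR side: only record the request and return as fast as possible. */
void model_isr_post(uint8_t task, uint16_t flags)
{
    if (count == FIFO_SIZE) {
        fifo_overflow = true;      /* no room: the OS never got to drain */
        return;
    }
    post_fifo[head].task  = task;
    post_fifo[head].flags = flags;
    head = (head + 1u) % FIFO_SIZE;
    count++;
}

/* OS side: runs once the pending ISRs are done; applies the requests. */
uint32_t model_os_drain(void)
{
    uint32_t handled = 0;

    while (count > 0u) {
        post_t p = post_fifo[tail];
        tail = (tail + 1u) % FIFO_SIZE;
        count--;
        task_events[p.task % MAX_TASKS] |= p.flags;
        handled++;
    }
    return handled;
}
```

As long as model_os_drain() gets to run between bursts, the queue never fills. Calling model_isr_post() five times in a row without a drain sets fifo_overflow, which corresponds to the ISRs leaving no time for the OS to catch up.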