Hello,
Why doesn't RTX implement a reference counter for its "tsk_lock"/"tsk_unlock" mechanism? As things are implemented at the moment, any task that calls "tsk_unlock" enables the scheduler. This can cause problems in nested calls, and it can corrupt the RTX kernel if some code calls it from interrupt mode while a task is executing an RTX API (which assumes the scheduler is disabled). There is no indication in the user manual that the kernel can be damaged if the scheduler is enabled from interrupt mode (NOTE: enabling the scheduler from interrupt mode only sets an interrupt line, which is inherently safe!). FreeRTOS has this feature (and much more). Why not RTX?
I know who I vote for...
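For illustration, the reference counter asked about above could be approximated in application code with a thin wrapper. This is only a sketch under assumptions: "sched_lock", "sched_unlock" and "lock_depth" are hypothetical names, not part of RTX, and "tsk_lock"/"tsk_unlock" are replaced by host-side stubs so the fragment compiles anywhere. It is meant for task context only and does nothing to make calls from interrupt mode safe.

```c
#include <assert.h>

/* Host-side stand-ins for the RTX calls so the sketch compiles anywhere.
   On a target, the real tsk_lock()/tsk_unlock() from the RTX header
   would be used instead. */
static int scheduler_locked = 0;
static void tsk_lock(void)   { scheduler_locked = 1; }
static void tsk_unlock(void) { scheduler_locked = 0; }

/* Hypothetical nesting counter; not part of RTX. */
static unsigned lock_depth = 0;

/* Disable the scheduler on the outermost call only. */
void sched_lock(void)
{
    if (lock_depth == 0) {
        tsk_lock();
    }
    lock_depth++;
}

/* Re-enable the scheduler only when the outermost lock is released. */
void sched_unlock(void)
{
    assert(lock_depth > 0);   /* catches an unbalanced unlock */
    lock_depth--;
    if (lock_depth == 0) {
        tsk_unlock();
    }
}
```

The counter is only ever nonzero while the scheduler is disabled, so a task switch can only happen when it reads zero. That is why a plain global is enough here, provided every caller goes through the wrapper and none of them runs in interrupt mode.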
all you have to do is enable the scheduler by mistake...
Sorry, but I don't consider a mistake in application code to be a particularly strong argument.
There are plenty of ways to destroy a kernel with mistakes in application code (e.g., invalid pointers). Would you expect Keil to catch all of those as well?
you pay for a licence - don't you think this is something Keil should be taking care of?
I'd say they have taken care of it. They supply the source code for the relevant sections and allow me (as a developer) to modify it as appropriate.
We seem to disagree. That's fine. Let me just say this: data corruption of the type you mentioned can be detected by static or dynamic code analyzers. Calling "tsk_unlock" at an unfortunate moment cannot. Also, think of software maintenance. A deadly "tsk_unlock" can sneak into the software through code reuse, or through an inexperienced or unaware developer. Finding out what went wrong can be tedious and far from trivial.
We seem to disagree. That's fine.
Agreed.
What I find strange is your apparent desire/need to scatter your code with the tsk_lock/tsk_unlock calls. I have very few of them and they are normally placed in well defined, easy to manage, positions. But, hey, maybe I'll be bitten soon and I'll change my point of view???
I'd be interested in knowing how Keil respond.
Cheers.
Actually, the source of the problem was this code:
#define SERVICE_WATCHDOG \
    tsk_lock(); \
    g_application_alive_signals |= ((int32u)1 << (os_tsk_self() - g_first_task_id)); \
    tsk_unlock();
This allows tasks to set their "alive bit" regularly without servicing the watchdog directly (which happens in one place only, and only if all alive bits are set). Originally it used a mutex, but "tsk_lock" and "tsk_unlock" are much cheaper computationally. However, given the nature of the code, what should have been done is to disable _all_ interrupts for this short period of time, not just the scheduler (thus, a mutex was insufficient as well). Later, this code was introduced into a function that was called from both user mode and interrupt mode (LPC2478), and that function was called during a long data transfer via the SSP bus. That was enough to cause arbitrary, random data corruption, resulting in entry into abort mode or in hanging in a loop inside an RTX API.
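A rough sketch of the alternative described above, masking all interrupts around the read-modify-write instead of locking the scheduler. The hooks "irq_save_disable" and "irq_restore" are hypothetical and stubbed here for a host build; on an ARM7 part such as the LPC2478 the interrupt mask can only be changed from a privileged mode, so the real hooks would be target-specific. The names "alive_signals", "signal_alive" and "all_tasks_alive" are also made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical port hooks, stubbed for the host. A real port would save
   the current interrupt mask, disable interrupts, and later restore the
   saved mask. */
static int irq_enabled = 1;
static int irq_save_disable(void)
{
    int was = irq_enabled;
    irq_enabled = 0;
    return was;
}
static void irq_restore(int was) { irq_enabled = was; }

/* One bit per task; shared between tasks and interrupt handlers. */
static volatile uint32_t alive_signals = 0;

/* Set the caller's alive bit with all interrupts masked. */
void signal_alive(unsigned task_index)
{
    int was;
    assert(task_index < 32u);
    was = irq_save_disable();
    alive_signals |= (uint32_t)1 << task_index;
    irq_restore(was);
}

/* Returns 1 and clears the expected bits once every task has reported,
   otherwise returns 0 and leaves the bits untouched. */
int all_tasks_alive(uint32_t expected_mask)
{
    int was = irq_save_disable();
    int all = (alive_signals & expected_mask) == expected_mask;
    if (all) {
        alive_signals &= ~expected_mask;
    }
    irq_restore(was);
    return all;
}
```

Because the previous mask is saved and restored rather than unconditionally re-enabled, the same function can be called with interrupts already masked without turning them back on by accident. The single place that services the watchdog would call "all_tasks_alive" and kick the hardware only when it returns 1.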
I can see no need for tsk_lock/tsk_unlock.
The g_application_alive_signals variable needs to be updated atomically if it is also accessed from the IRQ handler. That has nothing to do with tsk_lock()/tsk_unlock(). The usage of those functions is highly deprecated.
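As one possible reading of "updated atomically", here is a minimal sketch using C11 atomics. It assumes a toolchain that provides <stdatomic.h>, which may not hold for the compiler used in this thread; the ARM7TDMI core has no exclusive load/store instructions, so on that target an atomic OR would still have to be built on interrupt masking or a similar mechanism. The names "alive_bits", "mark_alive" and "take_alive_bits" are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Static storage, so the object starts out as zero. */
static atomic_uint_fast32_t alive_bits;

/* Atomically set one task's alive bit; usable from tasks and handlers
   alike, as long as the target implements the operation lock-free. */
void mark_alive(unsigned task_index)
{
    assert(task_index < 32u);
    atomic_fetch_or(&alive_bits, (uint_fast32_t)1 << task_index);
}

/* Atomically read and clear all bits, for the watchdog servicing code. */
uint_fast32_t take_alive_bits(void)
{
    return atomic_exchange(&alive_bits, 0);
}
```

The watchdog code would compare the value returned by "take_alive_bits" against the mask of expected tasks. Note that this variant clears the bits on every read, so a partial set is discarded rather than carried over to the next check.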
Just a short follow up:
Keil have updated the user manual of RTX to indicate that calling "tsk_unlock" from privileged mode is not allowed.