A nasty feature of RTX

Hello,

Why doesn't RTX implement a reference counter for its "tsk_lock"/"tsk_unlock" mechanism?
The way things are implemented at the moment, any call to "tsk_unlock" enables the scheduler. This causes problems with nested calls, and it can corrupt the RTX kernel if some code calls it from interrupt mode while a task is executing an RTX API (which assumes the scheduler is disabled). Nothing in the user manual indicates that the kernel can be damaged if the scheduler is enabled from interrupt mode (NOTE: enabling the scheduler from interrupt mode only sets an interrupt line, which is inherently safe!).
FreeRTOS has this feature (and much more). Why not RTX?

I know who I vote for...
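
To make the request concrete, here is a minimal sketch of the kind of nesting counter meant above. The names sched_lock_nested, sched_unlock_nested and lock_depth are hypothetical and not part of RTX, and the two stub functions only stand in for the real "tsk_lock"/"tsk_unlock" so the example builds on a host. It assumes the wrappers are called from tasks only; the counter is not protected against interrupt handlers.

```c
#include <assert.h>
#include <stdbool.h>

/* Host-side stand-ins for the RTX calls (assumption for illustration).
   On target, delete these two stubs and use the real RTX functions. */
static bool scheduler_enabled = true;
static void tsk_lock(void)   { scheduler_enabled = false; }
static void tsk_unlock(void) { scheduler_enabled = true; }

/* Hypothetical nesting depth; 0 means nobody holds the lock. */
static unsigned lock_depth = 0;

void sched_lock_nested(void)
{
    tsk_lock();     /* locking while already locked is harmless */
    lock_depth++;   /* no task switch can happen at this point */
}

void sched_unlock_nested(void)
{
    assert(lock_depth > 0);   /* an unbalanced unlock is a bug */
    if (--lock_depth == 0) {
        tsk_unlock();         /* only the outermost unlock re-enables */
    }
}
```

With this in place, an inner lock/unlock pair inside an already locked region leaves the scheduler disabled until the outermost unlock.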

0 H Mackie over 14 years ago in reply to Tamir Michael

all you have to do is enable the scheduler by mistake...

Sorry, but I don't consider a mistake in application code to be a particularly strong argument.

There are plenty of ways to destroy a kernel with mistakes in application code (e.g., invalid pointers). Would you expect Keil to catch all of those as well?

you pay for a licence - don't you think this is something Keil should be taking care of?

I'd say they have taken care of it. They supply the source code for the relevant sections and allow me (as a developer) to modify it as appropriate.

0 Tamir Michael over 14 years ago in reply to H Mackie

We seem to disagree. That's fine.
Let me just say this: data corruption of the type you mentioned can be detected by static or dynamic code analyzers. Calling "tsk_unlock" at an unfortunate moment cannot. Also, think of software maintenance. A deadly "tsk_unlock" can sneak into the software through code reuse, or through an inexperienced or unaware developer. Finding out what went wrong can be tedious and far from trivial.

0 H Mackie over 14 years ago in reply to Tamir Michael

We seem to disagree. That's fine.

Agreed.

What I find strange is your apparent desire/need to scatter your code with the tsk_lock/tsk_unlock calls. I have very few of them and they are normally placed in well defined, easy to manage, positions. But, hey, maybe I'll be bitten soon and I'll change my point of view???

I'd be interested in knowing how Keil respond.

Cheers.

0 Tamir Michael over 14 years ago in reply to H Mackie
Actually, the source of the problem was this code:

#define SERVICE_WATCHDOG \
    tsk_lock() ; \
    g_application_alive_signals |= ( (int32u)1 << (os_tsk_self() - g_first_task_id) ) ; \
    tsk_unlock() ;

This allows tasks to set their "alive bit" regularly without servicing the watchdog directly (that happens in one place only, and only if all alive bits are set). Originally it used a mutex, but "tsk_lock" and "tsk_unlock" are much cheaper computationally. However, given the nature of the code, what should have been done is to disable _all_ interrupts for this short period of time, not just the scheduler (thus, a mutex was insufficient as well).
Then this code was introduced into a function that was called from both user mode and interrupt mode (LPC2478), and that function was called during a long data transfer via the SSP bus. That was enough to cause arbitrary, random data corruption, resulting in entry into abort mode or hanging in a loop inside an RTX API.
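
As a rough sketch of the "disable all interrupts for a short period" variant: irq_save and irq_restore below are hypothetical port-layer names, not RTX or Keil functions, and here they only model the interrupt mask in a variable so the example runs on a host. On target they would have to save and mask the CPU interrupt state and later restore it. The task index is passed in as a parameter; in the macro above it was os_tsk_self() - g_first_task_id.

```c
#include <stdint.h>

/* Hypothetical port layer: models the interrupt mask in a variable.
   A real port would read and write the CPU status register instead. */
static int irq_masked = 0;
static int  irq_save(void)        { int prev = irq_masked; irq_masked = 1; return prev; }
static void irq_restore(int prev) { irq_masked = prev; }

static volatile uint32_t g_application_alive_signals = 0;

/* Set the alive bit of one task. Safe to nest, because the previous
   interrupt state is restored rather than interrupts being blindly
   enabled on exit. */
void service_watchdog(uint32_t task_index)
{
    int prev = irq_save();
    g_application_alive_signals |= (uint32_t)1 << task_index;
    irq_restore(prev);
}
```

Restoring the saved state, instead of unconditionally enabling, is what avoids the nesting problem that the plain "tsk_unlock" has: a caller that was already inside a masked region stays masked after the call returns.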

0 Franc Urbanc over 14 years ago in reply to Tamir Michael

I can see no need for tsk_lock/tsk_unlock.

The g_application_alive_signals needs to be updated atomically if it is also accessed from the IRQ handler. That has nothing to do with tsk_lock()/tsk_unlock(). The use of those functions is strongly deprecated.
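
One way to picture an atomic update, sketched with C11 <stdatomic.h>. This assumes a toolchain that supports C11 atomics for the target; on an ARM7 part that may not be the case, and a short interrupt-masked section would then do the same job. The function names signal_alive and all_tasks_alive are made up for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static _Atomic uint32_t g_application_alive_signals;

/* Callable from a task or an IRQ handler: the OR is one atomic
   read-modify-write, so no bit set by another context is lost. */
void signal_alive(unsigned task_index)
{
    atomic_fetch_or(&g_application_alive_signals, (uint32_t)1 << task_index);
}

/* Called from the single place that services the watchdog.
   Returns true, and clears the reported bits, only when every
   task in expected_mask has set its alive bit. */
bool all_tasks_alive(uint32_t expected_mask)
{
    uint32_t seen = atomic_load(&g_application_alive_signals);
    if ((seen & expected_mask) != expected_mask) {
        return false;
    }
    atomic_fetch_and(&g_application_alive_signals, ~expected_mask);
    return true;
}
```

The scheduler is never touched here, so there is nothing for an interrupt handler to re-enable by accident.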

0 Tamir Michael over 14 years ago in reply to Franc Urbanc

Just a short follow up:

Keil have updated the user manual of RTX to indicate that calling "tsk_unlock" from privileged mode is not allowed.