
A nasty feature of RTX

Hello,

Why doesn't RTX implement a reference counter for its "tsk_lock"/"tsk_unlock" mechanism?
The way things are implemented at the moment, any task that calls "tsk_unlock" enables the scheduler. This can cause problems with nested calls, and it can corrupt the RTX kernel if some code calls it from interrupt mode while a task is executing an RTX API (which assumes the scheduler is disabled). There is no indication in the user manual that the kernel can be damaged if the scheduler is enabled from interrupt mode (NOTE: enabling the scheduler from interrupt mode only sets an interrupt line which is inherently safe!).
FreeRTOS has this feature (and much more). Why not RTX?

I know who I vote for...

  • <quote>(NOTE: enabling the scheduler from interrupt mode only sets an interrupt line which is inherently safe!).</quote>

    Tapeer,

    did U mean 2 say inherently unsafe????

    Always yo're freind.

    Zeusti

  • Look, you are clearly not intelligent enough to understand the problem. It would be nice if you actually stopped spamming the forum or at least showed the minimal courage to use your name or at least proper English.

  • Tapeer,

    // response removed to spare ops embarrassment

    Always yo're freind.

    Zeusti.

  • I have reported this issue to Keil support and will report to you if there are any developments.

  • Is this a problem or even an issue (apart from it not being explicit in the documentation)?

    The source code for this part is supplied (in RTX_Config.C and RTX_lib.C), so anyone thinking they want to or need to implement a reference counter is free to do so. It's an easy enough bit of code.

    Personally, I'm happy with the way it is currently implemented.
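
For illustration, the kind of reference counter being discussed could look like the sketch below. It is not RTX code: `tsk_lock_nested`, `tsk_unlock_nested` and `lock_depth` are hypothetical names, and the `tsk_lock`/`tsk_unlock` bodies are host stand-ins so the example builds without the RTX headers. It assumes that calling `tsk_lock` twice in a row is harmless and that the wrappers are only ever called from task context, never from an interrupt handler.

```c
#include <stdint.h>

/* Host stand-ins for the RTX calls (assumption: on target these come
   from the RTX headers and must be removed from this file). */
int scheduler_enabled = 1;
void tsk_lock(void)   { scheduler_enabled = 0; }
void tsk_unlock(void) { scheduler_enabled = 1; }

/* Hypothetical nesting depth; not part of RTX. */
uint32_t lock_depth = 0;

/* Hypothetical wrapper: lock first, then count, so the counter is only
   ever modified while the scheduler is disabled. */
void tsk_lock_nested(void)
{
    tsk_lock();
    lock_depth++;
}

/* Hypothetical wrapper: only the outermost unlock re-enables the
   scheduler. An unbalanced unlock (depth already 0) is ignored. */
void tsk_unlock_nested(void)
{
    if (lock_depth > 0 && --lock_depth == 0) {
        tsk_unlock();
    }
}
```

With wrappers like these, an inner lock/unlock pair inside an already locked region no longer re-enables the scheduler early; only the outermost `tsk_unlock_nested` does. It does nothing to make the calls safe from interrupt mode.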

  • In my opinion this is a serious flaw in RTX.
    The above describes a cookbook scenario for destroying the kernel by mistake (or not...): all you have to do is enable the scheduler by mistake in an environment saturated with interrupts and you _will_ corrupt the kernel. Again: I have seen this happen. If you or anybody else is comfortable working with an operating system that has such a feature - suit yourself - that's cool.
    As for implementing this myself - first, we no longer use RTX in new products (FreeRTOS took its place. It is such a great piece of software!). Second - you pay for a licence - don't you think this is something Keil should be taking care of? I do.

  • <quote>all you have to do is enable the scheduler by mistake...</quote>

    Sorry, but I don't consider a mistake in application code to be a particularly strong argument.

    There are plenty of ways to destroy a kernel with mistakes in application code (e.g., invalid pointers). Would you expect Keil to catch all of those as well?

    <quote>you pay for a licence - don't you think this is something Keil should be taking care of?</quote>

    I'd say they have taken care of it. They supply the source code for the relevant sections and allow me (as a developer) to modify it as appropriate.

  • We seem to disagree. That's fine.
    Let me just say this: data corruption of the type you mentioned can be detected by static or dynamic code analyzers. A call to "tsk_unlock" at an unfortunate moment cannot. Also, think of software maintenance. A deadly "tsk_unlock" can sneak into the software through code reuse, or through an inexperienced or unaware developer. Finding out what went wrong can be tedious and far from trivial.

  • <quote>We seem to disagree. That's fine.</quote>

    Agreed.

    What I find strange is your apparent desire/need to scatter tsk_lock/tsk_unlock calls through your code. I have very few of them and they are normally placed in well-defined, easy-to-manage positions. But, hey, maybe I'll be bitten soon and I'll change my point of view?

    I'd be interested in knowing how Keil respond.

    Cheers.

  • Actually, the source of the problem was this code:

    /* Sets the calling task's "alive" bit with the scheduler locked. */
    #define SERVICE_WATCHDOG   tsk_lock() ;\
                               g_application_alive_signals |= ( (int32u)1<<(os_tsk_self() - g_first_task_id) ) ;\
                               tsk_unlock() ;

    This allows tasks to set their "alive bit" regularly without servicing the watchdog directly (that happens in one place, and only if all alive bits are set). Originally it used a mutex, but "tsk_lock" and "tsk_unlock" are much cheaper computationally. However, given the nature of the code, what should have been done is to disable _all_ interrupts for this short period of time, not just the scheduler (so a mutex was insufficient as well).
    Then this code was introduced into a function that was called from both user mode and interrupt mode (LPC2478), and that function was called during a long data transfer over the SSP bus. That was enough to cause arbitrary, random data corruption, resulting in entry into abort mode or hanging in a loop inside an RTX API.
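
A minimal sketch of the "disable all interrupts for this short period" approach described above, written so that it builds on a host. `irq_save` and `irq_restore` are hypothetical helpers, stubbed here with a plain flag; on the real target they would have to be implemented with whatever interrupt-masking mechanism the toolchain and CPU mode allow. `signal_alive` and its `task_index` parameter are also made up for the example, and `int32u` is assumed to be a 32-bit unsigned type.

```c
#include <stdint.h>

typedef uint32_t int32u;   /* assumption: int32u is a 32-bit unsigned type */

/* Host stand-in for the CPU interrupt-enable state (hypothetical). */
uint32_t irq_enabled = 1;

/* Hypothetical helper: mask interrupts and return the previous state. */
uint32_t irq_save(void)
{
    uint32_t prev = irq_enabled;
    irq_enabled = 0;
    return prev;
}

/* Hypothetical helper: put the interrupt state back as it was. */
void irq_restore(uint32_t prev)
{
    irq_enabled = prev;
}

volatile int32u g_application_alive_signals = 0;

/* Set one task's alive bit with interrupts masked around the
   read-modify-write. The previous state is restored rather than
   unconditionally re-enabled, so the function behaves the same when
   it is called with interrupts already masked. */
void signal_alive(uint32_t task_index)
{
    uint32_t prev = irq_save();
    g_application_alive_signals |= (int32u)1 << task_index;
    irq_restore(prev);
}
```

The save/restore pair is the point of the sketch: unlike a bare unlock or enable call, it cannot switch something back on that the caller had deliberately switched off.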

  • I can see no need for tsk_lock/tsk_unlock.

    If g_application_alive_signals is also accessed from the IRQ handler, it needs to be updated atomically. That has nothing to do with tsk_lock()/tsk_unlock(). Use of those functions is strongly discouraged.
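
To show what an atomic update of the alive bits means, here is a host-side sketch using C11 `<stdatomic.h>`. This is an illustration only: the ARM7TDMI core in the LPC2478 has no exclusive load/store instructions and the toolchain in question may not provide C11 atomics at all, so on that part atomicity would in practice have to come from masking interrupts. `signal_alive_atomic` and `all_tasks_alive` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdint.h>

_Atomic uint32_t g_application_alive_signals = 0;

/* Set one task's alive bit as a single indivisible read-modify-write. */
void signal_alive_atomic(uint32_t task_index)
{
    atomic_fetch_or(&g_application_alive_signals, (uint32_t)1 << task_index);
}

/* Return 1 and clear the bits if exactly the expected set of tasks has
   reported; otherwise return 0 and leave the bits untouched. The
   compare-and-swap makes the check and the clear one atomic step. */
int all_tasks_alive(uint32_t expected_mask)
{
    uint32_t expected = expected_mask;
    return atomic_compare_exchange_strong(&g_application_alive_signals,
                                          &expected, 0);
}
```

The single place that services the watchdog would call something like `all_tasks_alive` and kick the hardware only when it returns 1.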

  • Just a short follow up:

    Keil have updated the user manual of RTX to indicate that calling "tsk_unlock" from privileged mode is not allowed.