Hello,
I have a major problem with RTX, and Keil don't seem to be able to help (they want a simple scenario that reproduces the problem, but of course I cannot give them the hardware; maybe I can make it go wrong on an evaluation board). I'm using RTX as the backbone of a product that needs to run for extended periods of time without a reboot (weeks...). The problem is that RTX stops executing arbitrary tasks at arbitrary moments - they remain 'ready' but never get serviced. Today I discovered a task entering 'WAIT_MUT' while not using ANY mutex. My question: are there any tips for using RTX correctly? I am growing totally frustrated and tired of this; what am I supposed to tell the client?! I'm using the latest, and very expensive, RL-ARM without any results whatsoever. Can you share your experience with me?
Thank you for your attention,
Tamir
Not good enough means that a test can't prove correctness.
But a test that has an x% probability of pinpointing the location of an error can still be meaningful.
The big problem here is to estimate how large that percentage would be, i.e. the gain in relation to the cost.
The important thing to note is that checksummed data structures don't lead to correct programs. Checksumming is only a way to _maybe_ detect corruption.
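To make the idea concrete, here is a minimal sketch of a checksummed data structure. Everything in it is an assumption made for illustration: the payload fields, the names `guarded_cfg_t`, `guarded_store` and `guarded_is_intact`, and the choice of Fletcher-16 are not taken from RTX or from this thread. It only shows detection; it cannot prove the data is correct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical payload. Two uint32_t fields leave no padding bytes,
 * which matters because the checksum runs over the raw bytes. */
typedef struct {
    uint32_t baud;
    uint32_t timeout_ms;
} cfg_payload_t;

/* Payload plus the checksum that guards it. */
typedef struct {
    cfg_payload_t payload;
    uint16_t      sum;
} guarded_cfg_t;

/* Fletcher-16 over a byte buffer (one possible choice of checksum). */
static uint16_t fletcher16(const uint8_t *data, size_t len)
{
    uint16_t a = 0, b = 0;
    for (size_t i = 0; i < len; i++) {
        a = (uint16_t)((a + data[i]) % 255u);
        b = (uint16_t)((b + a) % 255u);
    }
    return (uint16_t)((b << 8) | a);
}

/* Every legitimate write goes through here so the checksum stays in step.
 * In a real system this would need protection against preemption. */
static void guarded_store(guarded_cfg_t *g, const cfg_payload_t *src)
{
    assert(g != NULL && src != NULL);
    g->payload = *src;
    g->sum = fletcher16((const uint8_t *)&g->payload, sizeof g->payload);
}

/* Returns false when the payload no longer matches its checksum. */
static bool guarded_is_intact(const guarded_cfg_t *g)
{
    assert(g != NULL);
    return g->sum == fletcher16((const uint8_t *)&g->payload, sizeof g->payload);
}
```

A stray write that bypasses `guarded_store` will usually make `guarded_is_intact` return false, but some corruptions cancel out in any checksum, so a passing check is never proof of integrity.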
In this case, checksumming could possibly tell which task was running when the corruption happened. And if every ISR sets a flag, then checksumming could possibly add a list of potential ISRs to look at more closely. But checksumming would possibly point at the wrong task if the memory corruption is caused by a DMA transfer that was started by another thread but creates the corruption after a task switch.
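A rough sketch of that bookkeeping, assuming a hypothetical hook that runs at every task switch and a hypothetical numbering of ISRs from 0 to 31 (neither is an RTX facility I know of): each ISR sets its bit, and the switch hook either clears the bits after a good check or freezes the suspects after a bad one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    volatile uint32_t isr_mask;   /* bit n set: ISR n ran since the last good check */
    bool     corrupted;           /* latched on the first failed check */
    uint8_t  suspect_task;        /* task that was running when the check failed */
    uint32_t suspect_isrs;        /* ISRs that ran since the last good check */
} suspect_log_t;

/* Called at the top of each ISR. The read-modify-write is not atomic,
 * so nested interrupts would need extra care in real code. */
static void note_isr(suspect_log_t *log, unsigned isr_id)
{
    assert(log != NULL);
    if (isr_id < 32u) {
        log->isr_mask |= (uint32_t)1u << isr_id;
    }
}

/* Called from the (hypothetical) task-switch hook with the result of the
 * checksum test and the id of the task being switched out. */
static void check_at_switch(suspect_log_t *log, uint8_t outgoing_task, bool data_intact)
{
    assert(log != NULL);
    if (log->corrupted) {
        return;                   /* keep the first report untouched */
    }
    if (data_intact) {
        log->isr_mask = 0;        /* everything so far is cleared of suspicion */
        return;
    }
    log->corrupted    = true;
    log->suspect_task = outgoing_task;
    log->suspect_isrs = log->isr_mask;
}
```

The result is only a list of suspects. A DMA transfer that lands after the switch would be blamed on whichever task happens to be running at the next check.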
But checksumming would possibly point at the wrong task, in case the memory corruption is caused by a DMA transfer, started by another thread but creating the corruption after a task switch.
ouch, you are so right. I overlooked that one...!
Per,
The point you made about the DMA transfers is indeed an issue. I never meant this to be anything more than a possible little help in case things are that much out of control (believe me, they were until a couple of days ago - nervous clients, nervous boss, nervous keyboard...). I don't think Keil are going to do this with RTX (there are other, more pressing issues...) - let's leave it as an intellectual exercise.
I regularly look at checksumming as one of the available tools to detect problems, but prefer to use it in situations where it can be included in the release build. Just as previously mentioned, it is best to test the same build that is expected to ship. Changing a single byte in RAM or flash can be enough to make the debug build pass all tests (even if buggy) while the release build fails - possibly in a routine the customer will only trigger once every three months.
The reason I posted was that Hans-Bernhard Broeker's post was aimed at pointing out that checksumming can't validate something as correct. But that is a separate issue from using it as a tool to detect something broken. A bigger issue with checksumming (at least when used in release builds) is deciding which action to take on a checksum error: auto-repair, reboot, deadlock, warn, ...
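One way to make that decision explicit is to put it in a single small policy function, so it is at least reviewed and tested rather than scattered through the code. The sketch below is just one possible policy under assumed inputs; the enum, the `integrity_ctx_t` fields and the ordering of the choices are all invented for illustration and would differ per product.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    ACT_NONE,      /* data is fine, carry on */
    ACT_REPAIR,    /* restore from an intact backup copy */
    ACT_WARN,      /* log it and continue without the data */
    ACT_REBOOT,    /* restart and rebuild state from scratch */
    ACT_HALT       /* stop in a safe state (the deliberate "deadlock") */
} integrity_action_t;

/* Hypothetical facts the application knows about the damaged object. */
typedef struct {
    bool backup_intact;    /* a second copy exists and passes its own check */
    bool data_essential;   /* the system cannot run sensibly without it */
    bool restart_is_safe;  /* a reboot cannot leave the hardware in a bad state */
} integrity_ctx_t;

static integrity_action_t choose_action(bool primary_intact, const integrity_ctx_t *ctx)
{
    assert(ctx != NULL);
    if (primary_intact)        return ACT_NONE;
    if (ctx->backup_intact)    return ACT_REPAIR;
    if (!ctx->data_essential)  return ACT_WARN;
    if (ctx->restart_is_safe)  return ACT_REBOOT;
    return ACT_HALT;
}
```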
I read this topic with a lot of interest, as it reminds me of the lack of debug support that RTX provides.
Statistical data such as the percentage of execution time per task, the state of a mutex, the number of free memory blocks, the number of free semaphores, etc. could be a VERY interesting improvement for the RTX library!
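For the execution-time part, the accounting itself is small. The sketch below assumes a hypothetical hook that is called on every task switch with the id of the next task and a free-running tick count; as far as I know RTX does not offer such a hook, so `stats_on_switch`, `stats_percent` and `MAX_TASKS` are made-up names that only show what the kernel would have to track.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TASKS 8u

typedef struct {
    uint32_t ticks[MAX_TASKS];  /* accumulated run time per task */
    uint32_t total;             /* sum of all accumulated run time */
    uint32_t last_switch;       /* tick count at the previous switch */
    uint8_t  current;           /* task that has been running since then */
} task_stats_t;

/* Charge the elapsed ticks to the outgoing task, then start timing the next one. */
static void stats_on_switch(task_stats_t *s, uint8_t next_task, uint32_t now)
{
    assert(s != NULL);
    if (next_task >= MAX_TASKS) {
        return;                              /* ignore ids outside the table */
    }
    uint32_t elapsed = now - s->last_switch; /* unsigned math survives wrap-around */
    s->ticks[s->current] += elapsed;
    s->total             += elapsed;
    s->last_switch        = now;
    s->current            = next_task;
}

/* Share of the measured time spent in one task, in whole percent. */
static uint32_t stats_percent(const task_stats_t *s, uint8_t task)
{
    assert(s != NULL);
    if (task >= MAX_TASKS || s->total == 0u) {
        return 0u;
    }
    return (uint32_t)(((uint64_t)s->ticks[task] * 100u) / s->total);
}
```

A task that stays 'ready' but shows 0% over a long window would stand out immediately, which is the kind of symptom described at the start of this thread.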