I am porting a bare-metal application to the Keil MDK-PRO middleware so that we can use USB Host.
I have nested interrupt handlers in the code; most take less than 25us to run. One interrupt handler, though, can run for as long as 1.5ms, and I am hoping this will not disrupt the RTOS.
This long ISR allows higher priority ISRs to nest within it. I realize that this could cause the SysTick 1ms interrupt to "skip" a millisecond, and I am hoping this will not cause any problems for the RTOS.
Rewriting this ISR would take a lot of work and I have limited time. Moving the code from the ISR to an RTOS task would slow the response time down a lot (from ~100us at present to over 1ms).
Does anyone have experience running long ISRs with CMSIS-OS?
The code is monitoring a UART running at 230400 baud with 2 stop bits and no parity, so roughly 50us per character. Packets are sent over the UART with a 3 character gap (150us) at the end of the packet.
Each received character sets a timer for 2 character times (100us), so the timer fires only when the packet gap arrives. At this point the code must quickly examine the packet header and generate a response. Any delay means the UART is sitting idle and bandwidth is lost.
This UART is used in a multi-drop RS485 bus passing real-time I/O data at 100Hz. There is also RPC and stream data, which is interleaved with the real-time I/O. The bus must also work at 115200 baud when the cable is long (1 km). Fitting all this data into a 100Hz cycle means each cycle has just 10ms, or about 100 characters at 115200 baud, to work with.
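The timing figures above can be checked with a few lines of C. This is only a sketch: the post gives the stop bits and parity but not the word length, so 8 data bits (11 bits per character on the wire) is an assumption, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bits on the wire per character: 1 start bit + data + parity + stop bits. */
static uint32_t bits_per_char(uint32_t data_bits, uint32_t parity_bits,
                              uint32_t stop_bits)
{
    return 1u + data_bits + parity_bits + stop_bits;
}

/* Time to send n_chars characters, in microseconds, rounded up. */
static uint32_t chars_to_us(uint32_t baud, uint32_t bits, uint32_t n_chars)
{
    assert(baud > 0u);
    uint64_t num = (uint64_t)bits * n_chars * 1000000u;
    return (uint32_t)((num + baud - 1u) / baud);
}

/* Whole characters that fit into a window of window_us microseconds. */
static uint32_t chars_per_window(uint32_t baud, uint32_t bits,
                                 uint32_t window_us)
{
    assert(bits > 0u);
    return (uint32_t)(((uint64_t)baud * window_us) /
                      ((uint64_t)bits * 1000000u));
}
```

Under the 8-data-bit assumption, one character at 230400 baud comes out at 48us and the 3 character gap at 144us, and a 10ms cycle holds 104 characters at 115200 baud, all close to the round numbers quoted above.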
The RS485 bus is a simplex master/slave arrangement (similar to MODBUS), meaning there can only be a single unit transmitting. This system has been working for several years and achieves these goals, but now I must add USB Host to it, which has led to this question.
Currently, when the code detects a packet gap, it starts a low priority ISR (which allows higher priority ISRs to interrupt it). This low priority ISR usually takes about 500us to copy the data from the UART receive buffer into internal storage, examine the header and fill the UART transmit buffer with the response. Rarely it can take up to 1.5ms, which is my concern.
The difficulty with moving the response code into a task is that the time from packet gap detection until the task runs would be over 1ms, leading to lost bandwidth (UART sitting idle).
Rewriting the entire bus system to avoid needing this quick response would be a major undertaking and my boss is already pressuring me to have this system working with USB yesterday :)
The simplest answer would be to leave the existing ISR structure in place and use it with CMSIS-OS RTX. The RTOS should not really care if it cannot run for a couple of milliseconds, unless there is a counter running off the SysTick interrupt. I would imagine, though, that the RTOS would use the SysTick hardware counter rather than counting the interrupts, but I don't know (and I don't have the source code for RTX).
By the way, the code is designed to "catch up" if it fails to run something at the requested time. For example, if the 100Hz I/O loop does not run at the 10ms "tick" it will execute twice at the next 10ms tick. The system is designed to be tolerant like that.
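A minimal sketch of that kind of catch-up behaviour follows. It is not the actual code from this system; `io_sched_t`, `io_sched_run` and `io_step` are hypothetical names, and the millisecond time source is assumed to be whatever free-running counter the application already has.

```c
#include <assert.h>
#include <stdint.h>

#define IO_PERIOD_MS 10u /* 100Hz I/O loop */

typedef struct {
    uint32_t next_due_ms; /* time at which the next I/O step is due */
} io_sched_t;

/* Stand-in for the real I/O step; it only counts how often it ran. */
static uint32_t io_steps;
static void io_step(void) { io_steps++; }

/* Run every step that has become due up to now_ms and return how many
 * were executed. A missed period is made up on the next call. The signed
 * difference keeps the comparison valid when the counter wraps. */
static uint32_t io_sched_run(io_sched_t *s, uint32_t now_ms,
                             void (*step)(void))
{
    uint32_t runs = 0u;
    assert(s && step);
    while ((int32_t)(now_ms - s->next_due_ms) >= 0) {
        step();
        s->next_due_ms += IO_PERIOD_MS;
        runs++;
    }
    return runs;
}
```

Because the schedule advances by a fixed period rather than being reset to "now", a call that arrives one period late executes the step twice, which matches the behaviour described above.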
The ISRs, with the exception of the long "generate response for UART" ISR, all run in less than 20us. It is expected that higher priority interrupts will slow down the "generate response for UART" ISR; this is harmless, as the bus protocol is designed to "catch up" as well.
The low priority bus traffic can simply be delayed to allow this to happen.
Some UARTs have a FIFO flexible enough that it can generate an interrupt once reception has stalled for two character times.
Another thing is that I would normally process data on-the-fly using a state machine if I need to be able to react with some action within a very short time after the last character of a message is received.
A third thing here is that you claim a very long time for copying the data at the end of the transfer. But the UART receive ISR should be copying the data on every receive interrupt and not do any extra copying when the message frame ends - either use a ring buffer or dual-buffering or whatever so you just make a single update of a counter, flag or pointer to make the full set of data available to the receive task - no additional copying should be involved at that stage unless it's data you need to write to EEPROM or similar.
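As a rough illustration of the "single update of a counter" idea, here is a sketch of a receive ring where the ISR stores each byte as it arrives and publishes a complete frame by copying one index. All names are hypothetical, the buffer size is an arbitrary power of two, and a single ISR writer with a single task reader is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_SIZE 256u /* power of two so indices wrap with a mask */

typedef struct {
    uint8_t data[RX_BUF_SIZE];
    volatile uint32_t head;      /* ISR: next free slot (free-running) */
    volatile uint32_t frame_end; /* ISR: published at the packet gap */
    uint32_t tail;               /* consumer: start of unread data */
} rx_ring_t;

/* Called from the receive interrupt for every byte. */
static bool rx_isr_byte(rx_ring_t *r, uint8_t b)
{
    if ((uint32_t)(r->head - r->tail) >= RX_BUF_SIZE) {
        return false; /* ring full: byte dropped */
    }
    r->data[r->head & (RX_BUF_SIZE - 1u)] = b;
    r->head++;
    return true;
}

/* Called when the gap timer fires: hand-over is one index update. */
static void rx_isr_gap(rx_ring_t *r) { r->frame_end = r->head; }

/* Consumer side: length of the published frame, read in place. */
static size_t rx_frame_len(const rx_ring_t *r)
{
    return (size_t)(r->frame_end - r->tail);
}

static uint8_t rx_frame_byte(const rx_ring_t *r, size_t i)
{
    assert(i < rx_frame_len(r));
    return r->data[(r->tail + (uint32_t)i) & (RX_BUF_SIZE - 1u)];
}

/* Consumer is done with the frame: free the space. */
static void rx_frame_release(rx_ring_t *r) { r->tail = r->frame_end; }
```

The consumer reads the frame where it lies, so the cost at the end of a frame is independent of the frame length. A real implementation would also need to consider memory barriers and what to do with dropped bytes.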
Another thing - you say the code looks at the header of the received data. But can't the ISR directly understand the header so it knows what to expect?
If I look at NMEA (GPS) data in an ISR, I can have the ISR know that $ is start of a line and \n is end of a line. So the ISR can prepackage individual lines for the processing task. The ISR can even use a state machine so it notices that $GPRMC should be stored in one buffer, $GPGGA in another buffer, ... And it can compute the checksum on-the-fly so when it reaches the line break it already knows if the line has a correct checksum or not.
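A stripped-down sketch of such a per-character state machine is shown below. It handles a single line buffer only (no sorting by sentence type), the names and the buffer size are made up for the example, and it assumes the usual NMEA layout of `$`, body, `*`, and two hex digits holding the XOR of the body characters. For brevity it reports the result at the second checksum digit rather than waiting for the line break.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NMEA_MAX_BODY 80u /* assumed buffer size for this sketch */

typedef enum { NMEA_IDLE, NMEA_BODY, NMEA_CK_HI, NMEA_CK_LO } nmea_state_t;

typedef struct {
    nmea_state_t state;
    uint8_t calc;      /* running XOR of the body characters */
    uint8_t recv;      /* checksum decoded from the two hex digits */
    uint8_t len;       /* number of body characters stored */
    bool ready;        /* a complete sentence is in body[] */
    bool checksum_ok;  /* valid only when ready is true */
    char body[NMEA_MAX_BODY + 1u];
} nmea_rx_t;

static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Feed one received character; intended to be called from the RX ISR. */
static void nmea_rx_char(nmea_rx_t *p, char c)
{
    int v;
    if (c == '$') { /* start of sentence always resynchronises */
        p->state = NMEA_BODY;
        p->calc = 0u;
        p->len = 0u;
        p->ready = false;
        return;
    }
    switch (p->state) {
    case NMEA_BODY:
        if (c == '*') {
            p->state = NMEA_CK_HI;
        } else if (c == '\r' || c == '\n' || p->len >= NMEA_MAX_BODY) {
            p->state = NMEA_IDLE; /* no checksum or too long: drop */
        } else {
            p->body[p->len++] = c;
            p->calc ^= (uint8_t)c; /* checksum computed on the fly */
        }
        break;
    case NMEA_CK_HI:
        v = hex_val(c);
        if (v < 0) { p->state = NMEA_IDLE; break; }
        p->recv = (uint8_t)(v << 4);
        p->state = NMEA_CK_LO;
        break;
    case NMEA_CK_LO:
        v = hex_val(c);
        if (v >= 0) {
            assert(p->len <= NMEA_MAX_BODY);
            p->recv |= (uint8_t)v;
            p->body[p->len] = '\0';
            p->checksum_ok = (p->recv == p->calc);
            p->ready = true;
        }
        p->state = NMEA_IDLE;
        break;
    default:
        break; /* NMEA_IDLE: ignore everything until the next '$' */
    }
}
```

Each call does a constant, small amount of work, so nothing is left to do at the end of the line except hand the buffer over.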
Most message-based systems either have unique characters for framing, or have a fixed header that contains enough information to know the length of the message. So the ISR should be able to process that information by consuming just a few microseconds and in the end manage the handover of a complete message while consuming just a few microseconds. It would be very strange if there is a need to consume 100us or more at the end of the message just for hand-over.
The goal with realtime systems is seldom to produce the fastest code possible. It's normally more important to see how the processing can be split to minimize the worst-case timing.
In your case, I suggest you take a very, very close look at that data copying, and figure out if you need it at all. Or if you can amortize it.
Thanks for the reply, I appreciate it. I don't have the option of changing the wire protocol, which must be backwards compatible with MODBUS.
There is a lot more detail, which I don't want to bore anyone with, as to how and why the existing software is this way. Suffice it to say the UART has a ring buffer and there is a state engine that examines the packets (which runs in the long ISR). Data copying itself is fast and is only used to move data between the UART ring buffer and the application space.
If the software had been designed with an RTOS initially then it would have been designed quite differently. Unfortunately parts of the software go back 15 years and were designed for a much smaller microcontroller using a superloop.
If I cannot use the existing code with CMSIS-OS RTX because of this long ISR then I face a lot more work. Anyway thanks for taking the time to reply.