I have a question about the correct setting of the Cortex-M interrupt priorities in CMSIS-RTOS RTX. The CMSIS-RTX online manual (for the latest version 4.78) does not talk about this issue, and in fact, does not explain at all how to write ISRs for CMSIS-RTX.
Correct setting of the Cortex-M interrupt priorities is important, because the CMSIS-RTOS RTX kernel apparently does not disable interrupts explicitly. Instead, the atomic execution of critical sections is implemented inside the SVC (Supervisor Call) exception, which runs at a priority higher than any task.
Specifically, the priority of the SVC exception is set in the NVIC to the lowest possible level (0xFF), as are the priorities of the PendSV and SysTick exceptions. With this setting, these exceptions cannot preempt each other, so their execution is mutually exclusive. So far so good.
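To make the preemption argument concrete, here is a minimal host-side sketch of the rule being relied on. The helper name can_preempt is hypothetical (it is not part of CMSIS or RTX), and the model assumes that all implemented priority bits act as preemption priority, with no sub-priority bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of CMSIS or RTX.
 * Models the Cortex-M rule that a pending exception preempts the active
 * one only if its priority VALUE is numerically lower (more urgent).
 * Assumption: no sub-priority bits are in use. */
static bool can_preempt(uint8_t pending_prio, uint8_t active_prio)
{
    return pending_prio < active_prio;
}
```

Under this model, can_preempt(0xFF, 0xFF) is false, which is why SVC, PendSV and SysTick exclude each other, while can_preempt(0x00, 0xFF) is true, which is the case the question is worried about.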
But what about the application-specific interrupts? It seems that they too need to be prioritized in the NVIC at the lowest level (0xFF), because any other setting would allow the IRQs to preempt the SVC/PendSV exceptions and could lead to corruption of the internal RTX state.
Is this observation accurate?
If so, then RTX would essentially defeat the purpose of the NVIC ("nested" interrupt controller), as the interrupts would NOT be allowed to nest.
Second, the CMSIS-RTX users must be made aware that they need to manually adjust the priorities of all their interrupts to 0xFF, by calling NVIC_SetPriority(<irq>_IRQn, 0xFFU) for all their interrupts. This is necessary, because all IRQs have priority 0 (the highest) out of reset, and with this setting they would be allowed to preempt RTX critical sections.
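As a rough illustration of what passing 0xFF means in practice, the sketch below models on a host PC the shift-and-mask that the CMSIS NVIC_SetPriority() function applies before writing the 8-bit priority field. The function names nvic_encode_priority and lowest_priority are made up for this example, and prio_bits stands in for the device-specific __NVIC_PRIO_BITS constant.

```c
#include <stdint.h>

/* Hypothetical host-side model of the encoding done by NVIC_SetPriority():
 * the logical priority is shifted into the implemented (upper) bits of the
 * 8-bit field. prio_bits stands in for __NVIC_PRIO_BITS. */
static uint8_t nvic_encode_priority(uint32_t priority, uint32_t prio_bits)
{
    return (uint8_t)((priority << (8U - prio_bits)) & 0xFFUL);
}

/* Hypothetical helper: the least urgent logical priority that can be
 * expressed with prio_bits implemented bits. */
static uint32_t lowest_priority(uint32_t prio_bits)
{
    return ((uint32_t)1U << prio_bits) - 1U;
}
```

With 4 implemented bits, both 0xFF and lowest_priority(4) (which is 15) encode to the register value 0xF0, so the two spellings end up at the same lowest level. On a real target the equivalent call would presumably look like NVIC_SetPriority(<irq>_IRQn, (1UL << __NVIC_PRIO_BITS) - 1UL), repeated for every IRQ the application enables.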
And finally, the whole design of CMSIS-RTOS RTX around the SVC exception makes most of the RTOS code execute in the exception context of the Cortex-M processor. This immensely hinders debugging, because you lose the call-stack context. In fact, the whole RTOS code feels like debugging a humongous ISR. This is exactly counter to the best practice to "keep your ISRs short".
Any comments about these issues would be highly appreciated.
Miro Samek state-machine.com
The FPU setting is up to you. It is actually per task: some tasks can use the FPU and others do not have to. If a task has it enabled, the processor will automatically save the lower set of FPU registers, and the code in the HAL_CM4.c file checks the flag and, if it is set, saves/restores the high registers.
Yes, the nature of the OS calls from an ISR is not apparent at all, especially if you have some preconceived notion of what you are looking for.
You can look at the RTX code (it is best to look at that first, and then at the "shim" between CMSIS-RTX and RTX). In the file rt_event.c you can see a function called "isr_evt_set". This can safely be called from an ISR. You may notice (you will when you follow its path) that it does not actually make any calls to change any events in any tasks. It adds (safely) an entry to the "post service request" queue ("ps-queue"). It then sets the PendSV flag for later processing. You should be able to see this. Then you just need to look at the PendSV processing, where this queue is serviced: all items are popped out of the queue and processed. It should be fairly easy to convince yourself that the PendSV is safe. The only "issue" would be the decrement of the item count in the queue, and you should be able to satisfy yourself that this is safe. We have a similar situation on the insertion into the queue.
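The deferral pattern described above can be sketched roughly as follows. This is a simplified, single-threaded host-side model and not the actual RTX source: every name in it (psq_entry_t, isr_evt_set_sketch, pendsv_sketch, task_events, PSQ_SIZE, NUM_TASKS) is made up for illustration, and it ignores the atomicity of the count update, which the real kernel has to handle.

```c
#include <stdbool.h>
#include <stdint.h>

#define PSQ_SIZE  16U  /* assumed capacity of the post-service queue */
#define NUM_TASKS 4U   /* assumed number of tasks in this model */

typedef struct {
    uint32_t task_id;  /* hypothetical: task whose events are to be set */
    uint32_t flags;    /* hypothetical: event flags to set */
} psq_entry_t;

static psq_entry_t psq[PSQ_SIZE];
static volatile uint32_t psq_head;    /* next slot written by the ISR side */
static volatile uint32_t psq_tail;    /* next slot read by the PendSV side */
static volatile uint32_t psq_count;   /* number of queued requests */
static volatile bool pendsv_pending;  /* stands in for pending PendSV */
static uint32_t task_events[NUM_TASKS]; /* stands in for kernel state */

/* ISR side: only records the request; kernel state is not touched here. */
static bool isr_evt_set_sketch(uint32_t task_id, uint32_t flags)
{
    if (task_id >= NUM_TASKS || psq_count == PSQ_SIZE) {
        return false;                 /* bad task or queue overflow */
    }
    psq[psq_head].task_id = task_id;
    psq[psq_head].flags = flags;
    psq_head = (psq_head + 1U) % PSQ_SIZE;
    psq_count++;                      /* must be atomic on a real target */
    pendsv_pending = true;            /* defer the real work to PendSV */
    return true;
}

/* PendSV side: runs at the lowest priority and drains the whole queue. */
static void pendsv_sketch(void)
{
    pendsv_pending = false;
    while (psq_count > 0U) {
        psq_entry_t e = psq[psq_tail];
        psq_tail = (psq_tail + 1U) % PSQ_SIZE;
        psq_count--;                  /* must be atomic on a real target */
        task_events[e.task_id] |= e.flags; /* the actual kernel update */
    }
}
```

The point of the split is that only pendsv_sketch modifies task_events, so a high-priority ISR that calls isr_evt_set_sketch never touches kernel state directly; it can only add an entry and request the deferred processing.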
I certainly understand that it is not obvious that this is fully "ISR-safe". I had to assume that Keil must know what they are doing and tried to figure out how it was safe. The answer is that all OS function calls ARE made at priority level 0xFF, which you had already figured needed to be the case. It is the safe queuing of the OS calls to be made, and the processing of them in the PendSV, that allows nested ISRs to actually behave properly. I have not mentioned yet that the post-processing queue is of limited size and can overflow. This is really a catastrophic error with no nice way to recover (you cannot just "try again" later). The size of this queue is adjustable by the user, and the entries are not large, so that helps limit the total space. This used to be hard-coded to 16 entries, which really was sufficient for almost all applications, but it is not hard to picture a case that would require more, so they added that flexibility.