I finally got the EventRecorder working, but the component tab doesn't show the thread names: they are all called "RTX Thread". I already set a name in the osThreadAttr_t, but it is not visible during debugging. I can also see the thread_id in hex, but I would have to know those values by heart, which is not practical.
Concurrent execution of threads is not the same as concurrent execution of instructions from different threads. System Analyzer says multiple threads are in the RUNNING state. That doesn't mean the instructions for the code in each thread are executing concurrently at the timescale of the CPU clock.
System Analyzer gets its data from the EventRecorder, so everything being in the RUNNING state could just be an artifact of data logging falling behind (as you noted). You can trim back RTOS events to the critical ones you need to get debugging done, which might just be a few custom events.
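As a rough sketch of that trimming, assuming the standard Event Recorder API from the Keil ARM_Compiler pack: start with all recording off, then enable only one application component plus error-level events from the RTOS component range (0xF0 to 0xFF). The names APP_EVT_COMP, EVT_SAMPLE_READY, trace_setup and trace_sample_ready are made up for illustration. The fallback section is only there so the sketch compiles off-target, and its values are placeholders. If RTX_Config.h is set up to initialize the recorder and apply its own filters, it is probably simpler to adjust the filters there instead.

```c
#include <stdint.h>

#if defined(__has_include)
#  if __has_include("EventRecorder.h")
#    include "EventRecorder.h"      /* real Keil header, used on the target */
#    define HAVE_EVENT_RECORDER 1
#  endif
#endif

#ifndef HAVE_EVENT_RECORDER
/* Host-only stand-ins so the sketch compiles without the Keil pack.
   Values are placeholders; the stubs only remember the last event ID. */
#define EventRecordNone   0x00U
#define EventRecordError  0x01U
#define EventRecordAll    0x0FU
#define EventLevelOp      0x20000U
#define EventID(level, comp, msg) ((level) | ((uint32_t)(comp) << 8) | (msg))
static uint32_t stub_last_id;
static uint32_t EventRecorderInitialize(uint32_t recording, uint32_t start) {
  (void)recording; (void)start; return 1U;
}
static uint32_t EventRecorderEnable(uint32_t recording, uint32_t first, uint32_t last) {
  (void)recording; (void)first; (void)last; return 1U;
}
static uint32_t EventRecord2(uint32_t id, uint32_t val1, uint32_t val2) {
  (void)val1; (void)val2; stub_last_id = id; return 1U;
}
#endif

/* Hypothetical application component number and event ID. */
#define APP_EVT_COMP      0x00U
#define EVT_SAMPLE_READY  EventID(EventLevelOp, APP_EVT_COMP, 0x01U)

/* Returns 1 if every recorder call reported success, otherwise 0. */
uint32_t trace_setup(void) {
  /* Nothing is recorded by default, but the recorder itself is started. */
  uint32_t ok = EventRecorderInitialize(EventRecordNone, 1U);
  /* Record every level for the application component only. */
  ok &= EventRecorderEnable(EventRecordAll, APP_EVT_COMP, APP_EVT_COMP);
  /* Keep only error-level events from the RTOS component range. */
  ok &= EventRecorderEnable(EventRecordError, 0xF0U, 0xFFU);
  return ok;
}

/* Emit one custom event carrying two 32-bit values. */
void trace_sample_ready(uint32_t channel, uint32_t value) {
  (void)EventRecord2(EVT_SAMPLE_READY, channel, value);
}
```

Note that with the RTOS events cut back this far, System Analyzer will have little or no thread state to display, so this suits the case where a handful of custom events is enough to debug the problem.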
The IDE will use bandwidth on the debugger link to maintain System Views, watched variables, memory, etc. Turn off any of those you don't need to maximize the bandwidth for EventRecorder.
Instead of calling osThreadYield(), I prefer to have the thread wait for signals sent to it (which can be set in an ISR) and/or timers. That means most of the time is spent in the RTOS idle thread. On the other hand, osThreadYield() could be appropriate for your design, unless things aren't working as desired.
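A minimal sketch of the signal-based approach, assuming CMSIS-RTOS2, where signals are called thread flags (osThreadFlagsSet / osThreadFlagsWait). FLAG_DATA_READY, sensor_irq_handler, worker_wait_once and worker_thread are hypothetical names, and worker_id is assumed to be filled in from osThreadNew() elsewhere. The fallback section only exists so the sketch compiles off-target. It does not block the way the real call does.

```c
#include <stdint.h>

#if defined(__has_include)
#  if __has_include("cmsis_os2.h")
#    include "cmsis_os2.h"          /* real CMSIS-RTOS2 API on the target */
#    define HAVE_CMSIS_OS2 1
#  endif
#endif

#ifndef HAVE_CMSIS_OS2
/* Host-only stand-ins so the sketch compiles without CMSIS. They never
   block and only mimic set/wait/clear for a single thread's flags. */
typedef void *osThreadId_t;
#define osFlagsWaitAny 0x00000000U
#define osWaitForever  0xFFFFFFFFU
#define osFlagsError   0x80000000U
static uint32_t stub_flags;
static uint32_t osThreadFlagsSet(osThreadId_t id, uint32_t flags) {
  (void)id; stub_flags |= flags; return stub_flags;
}
static uint32_t osThreadFlagsWait(uint32_t flags, uint32_t options, uint32_t timeout) {
  (void)options; (void)timeout;
  uint32_t hit = stub_flags & flags;
  if (hit == 0U) { return osFlagsError; }  /* the real call would block here */
  stub_flags &= ~flags;                    /* flags are cleared once consumed */
  return hit;
}
#endif

#define FLAG_DATA_READY 0x01U     /* hypothetical flag bit */

static osThreadId_t worker_id;    /* assumed to be set from osThreadNew() */
static uint32_t samples_handled;

/* Hypothetical ISR: wakes the worker instead of letting it poll. */
void sensor_irq_handler(void) {
  (void)osThreadFlagsSet(worker_id, FLAG_DATA_READY);
}

/* One pass of the worker loop. Returns 1 if a flag was handled, 0 on error. */
int worker_wait_once(void) {
  uint32_t flags = osThreadFlagsWait(FLAG_DATA_READY, osFlagsWaitAny, osWaitForever);
  if ((flags & osFlagsError) != 0U) {
    return 0;
  }
  samples_handled++;              /* placeholder for the real processing */
  return 1;
}

/* Thread body: blocked in osThreadFlagsWait() whenever there is no work. */
void worker_thread(void *argument) {
  (void)argument;
  for (;;) {
    (void)worker_wait_once();
  }
}
```

While the worker is blocked it uses no CPU time, so the scheduler can run other threads or the idle thread, and the trace shows far fewer thread switches than a loop built around osThreadYield().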