What are the real advantages of using two stack pointers and different privilege levels on an MCU with no MMU?

Does the added overhead of context switching and SVC handling pay off? With all threads sharing the same address space, a user thread can still overrun a kernel stack if its code writes there.
Could anyone with more expertise elaborate on that?
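
To make the concern concrete, here is a minimal sketch in C of what I mean. It is a toy model, not real MCU code: a flat array stands in for RAM, and every name in it (`ram`, `thread_push`, `kernel_stack_intact`, the sizes and the canary value) is hypothetical. The assumption it encodes is that nothing checks the lower bound of the thread's stack region, so a thread that pushes too much simply runs into the kernel's stack area.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model (assumption): one flat RAM shared by kernel and thread,
 * with no hardware check on where a stack write lands. */
#define RAM_SIZE      64u
#define KERNEL_TOP    32u    /* kernel stack area: ram[0..31]                  */
#define THREAD_TOP    64u    /* thread stack area: ram[32..63], grows downward */
#define KERNEL_CANARY 0xA5u  /* hypothetical fill pattern for the kernel area  */

uint8_t  ram[RAM_SIZE];
uint32_t thread_sp;          /* index of the last byte pushed by the thread */

/* Fill the kernel area with the canary and reset the thread stack. */
void ram_init(void)
{
    memset(ram, KERNEL_CANARY, KERNEL_TOP);
    memset(ram + KERNEL_TOP, 0, RAM_SIZE - KERNEL_TOP);
    thread_sp = THREAD_TOP;
}

/* Push one byte on the thread stack. Note that nothing compares
 * thread_sp against KERNEL_TOP: the separate stack pointer does not
 * by itself confine the thread to its own region. */
void thread_push(uint8_t value)
{
    assert(thread_sp > 0u);  /* only keeps the toy array access in bounds */
    ram[--thread_sp] = value;
}

/* Return 1 if every byte of the kernel area still holds the canary. */
int kernel_stack_intact(void)
{
    for (uint32_t i = 0; i < KERNEL_TOP; i++) {
        if (ram[i] != KERNEL_CANARY) {
            return 0;
        }
    }
    return 1;
}
```

In this model, after `ram_init()` the thread can push 32 bytes safely; the 33rd `thread_push()` lands in the kernel area and `kernel_stack_intact()` starts returning 0. That is the situation I am asking about.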

Thank you very much.