Presumably the two stages are compiled independently, and the first doesn't remain resident (or offer any services) after it hands control over to the second. At that point, surely it doesn't matter if the RW data, heap and stack of the second overlap with those of the first? It would appear that all the first needs to do is set the MSP to the top of RAM and BX into the second stage.
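
For what it's worth, here is a minimal sketch of that hand-off in C, assuming a Cortex-M part and a GCC/Clang toolchain. `RAM_TOP`, `STAGE2_ENTRY`, `thumb_target` and `jump_to_stage2` are names I made up for illustration, and the addresses are placeholders; the real values would come from your linker scripts. It leaves out anything else you might want before the jump, such as masking interrupts.

```c
#include <stdint.h>

/* Hypothetical placeholder addresses; take the real ones from the linker scripts. */
#define RAM_TOP      0x20005000u  /* assumed: one past the last byte of RAM */
#define STAGE2_ENTRY 0x08004000u  /* assumed: address of the second stage's entry code */

/* Cortex-M only executes Thumb code, so the address given to BX needs bit 0 set. */
static inline uint32_t thumb_target(uint32_t addr)
{
    return addr | 1u;
}

#if defined(__ARM_ARCH_PROFILE) && (__ARM_ARCH_PROFILE == 'M')
/* Load the main stack pointer, then branch to the second stage. Never returns.
 * Both values are already in registers before MSP changes, so nothing here
 * touches the first stage's stack after the switch. */
__attribute__((noreturn))
static void jump_to_stage2(uint32_t stack_top, uint32_t entry)
{
    __asm volatile(
        "msr msp, %0\n"
        "bx  %1\n"
        :
        : "r"(stack_top), "r"(thumb_target(entry))
        : "memory");
    __builtin_unreachable();
}
#endif
```

The first stage would end with something like `jump_to_stage2(RAM_TOP, STAGE2_ENTRY);`. Since nothing from the first stage is used after that point, the second stage's startup code is free to lay out its own RW data, heap and stack over the same RAM.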