Dear all,
I am interested in a scenario where I want to host two guest OSes on top of a bare-metal hypervisor on an ARM mobile platform. The total memory available on the platform is 4 GB, of which I want to expose exclusively 2 GB of contiguous RAM to each guest OS. Could you please guide me through my two concerns below:
1- If I change the FDT (device tree) of each guest OS to advertise exclusively 2 GB of contiguous memory, can I be assured that the guest kernel will only access those 2 GB, and furthermore that it is not even aware of the existence of the other 2 GB of memory on the platform? (See the sketch after question 2 for what I mean by the FDT change.)
2- I would prefer that each guest OS manage its memory directly, without the intervention of the host hypervisor (in other words, the guest physical address equals the actual machine physical address). In such a scenario, can I resort to a XenARM-like hypervisor and simply disable the second-stage translation in the Xen hypervisor code? Would that work, or is there actually a better way to do it? Please share your experience.
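
To make question 1 concrete, here is a minimal sketch (in C, using libfdt) of how I imagine patching each guest's DTB so that its memory node advertises a single 2 GB region. The "/memory" path, the 64-bit address/size cells, and the function name limit_guest_memory are assumptions for illustration, not taken from any real hypervisor:

    #include <libfdt.h>
    #include <stdint.h>

    /* Rewrite the guest DTB's memory node to advertise one contiguous
     * region of `size` bytes starting at `base`. Assumes the node uses
     * #address-cells = <2> and #size-cells = <2>, as is typical on ARMv8. */
    int limit_guest_memory(void *fdt, uint64_t base, uint64_t size)
    {
        int node = fdt_path_offset(fdt, "/memory");
        if (node < 0)
            return node; /* node not found or malformed DTB */

        /* reg = <base size>, each encoded as a big-endian 64-bit cell pair */
        uint64_t reg[2] = { cpu_to_fdt64(base), cpu_to_fdt64(size) };
        return fdt_setprop(fdt, node, "reg", reg, sizeof(reg));
    }

(On many boards the node is named e.g. /memory@80000000 rather than /memory, so the path would need adjusting for the actual platform.)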
Best wishes.
Thanks Daith,
Yes, my intention is to remove the hypervisor translation tables so that the output of the first-stage translation (the intermediate physical address, IPA) is directly equivalent to the machine physical address, i.e. no second-stage translation in the hypervisor. Furthermore, each of the two guest OSes would see/access only its own portion of the physical memory, by virtue of the modified device tree. My point is not only performance (though I believe performance is one of the benefits).
I am only interested in running unmodified kernels, without any kind of paravirtualization.
What exactly do you mean by the stages being cached separately? Is that about the TLB translations? In that case, wouldn't performance depend on the workload's memory access patterns, e.g. the regularity and locality of its accesses?
Please let me know (ARM community) what you think about my initial two questions, as they are fundamental to my project. My initial experimentation and slight code modifications seem somewhat encouraging, but I am not very sure.
Thank you so much.
One of the advantages of stage 2 translation is that it prevents a guest from (accidentally or maliciously) accessing the resources of other guests or of the hypervisor. Without it, you are relying on the two guests being well behaved. I think it is fair to say that it is unlikely that an OS would map in addresses that it believed had nothing behind them, but it is certainly not impossible. The question becomes: why are you using virtualization to run two guests in the first place? If, for example, it is for sandboxing, then relying on the guests being well behaved does not seem like a great idea.
Second-stage translation does not need to be a big overhead, and it does not stop you from having flat-mapped addresses (i.e. IPA == PA). You said you want to give each guest 2 GB of contiguous RAM. The translation table format offers 1 GB blocks, so that is just two entries per guest.
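
To put numbers on that, here is a rough sketch of a stage-2 level-1 table that flat-maps 2 GB for one guest with two 1 GB block descriptors, assuming a 4 KB granule and the standard ARMv8-A VMSA stage-2 descriptor encodings. This is not Xen code; the macro and function names are mine:

    #include <stdint.h>

    /* Stage-2 block descriptor fields (ARMv8-A VMSA):
     * bits[1:0] = 0b01 marks a valid block descriptor. */
    #define S2_BLOCK_VALID    (1UL << 0)
    #define S2_MEMATTR_NORMAL (0xFUL << 2)  /* Normal, write-back cacheable */
    #define S2_AP_RW          (3UL << 6)    /* read/write                   */
    #define S2_SH_INNER       (3UL << 8)    /* inner shareable              */
    #define S2_AF             (1UL << 10)   /* access flag                  */

    #define S2_BLOCK_ATTRS (S2_BLOCK_VALID | S2_MEMATTR_NORMAL | \
                            S2_AP_RW | S2_SH_INNER | S2_AF)

    /* Level-1 table: with a 4 KB granule, each entry maps 1 GB of IPA space. */
    static uint64_t s2_l1_table[512] __attribute__((aligned(4096)));

    /* Flat-map 2 GB for a guest whose RAM starts at base_pa
     * (which must be 1 GB aligned), so that IPA == PA. */
    void map_guest_flat(uint64_t base_pa)
    {
        uint64_t idx = base_pa >> 30; /* index of the first 1 GB block */
        s2_l1_table[idx]     = base_pa                 | S2_BLOCK_ATTRS;
        s2_l1_table[idx + 1] = (base_pa + (1UL << 30)) | S2_BLOCK_ATTRS;
    }

With only two live entries per guest, the rest of the table stays invalid, so any stray access by the guest faults into the hypervisor rather than hitting the other guest's memory.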
The overheads of the extra translation are very small: the second stage will be cached, and the whole translation will be cached after the first access. What you are talking about is a very difficult thing to do, which is exactly why the hardware support was added. To ensure the guest OSes do not access other areas without stage 2, you would need to make all translation tables inaccessible to the guests and interpret every update in the hypervisor. With stage 2, by contrast, the TLB does not even have to be flushed when switching between guest OSes, as the VMID can be used to identify each one.
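
For completeness, switching between the two guests is then little more than repointing VTTBR_EL2 at the other guest's stage-2 table, tagged with that guest's VMID. A hypothetical sketch, assuming 8-bit VMIDs (which occupy bits [55:48] of VTTBR_EL2):

    #include <stdint.h>

    /* Point VTTBR_EL2 at a guest's stage-2 table base and tag it with
     * the guest's VMID; TLB entries are keyed by VMID, so no flush is
     * needed when switching guests. */
    static inline void switch_stage2(uint64_t s2_table_pa, uint64_t vmid)
    {
        uint64_t vttbr = s2_table_pa | (vmid << 48);
        __asm__ volatile("msr vttbr_el2, %0\n\tisb" :: "r"(vttbr));
    }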