I would like to understand precisely the implications of misprogramming the Contiguous bit in VMSAv8-64 translation tables.

I have a hypervisor running at EL2 in the AArch64 execution state, using two-stage memory translation for the guests. At some point, the hypervisor needs to remove a guest's access to an IPA range, and sometime later restore it. One way to do so is to clear the Access Flag in the relevant Stage 2 descriptors, invalidate the relevant TLB entries, and sometime later set the Access Flags again. Other CPUs may be concurrently accessing the affected memory ranges.

I foresee no problem as long as none of the affected descriptors have the Contiguous bit set. If some of them do, however, I cannot atomically update a block of adjacent descriptors constrained by the Contiguous bit. I would therefore temporarily violate the constraint, and other CPUs could observe the violation.

Ideally, I would like a guarantee for a worst-case scenario such as "if some descriptor d1 with value v1 has the Contiguous bit set, then any access involving a descriptor d2 in the same block b of adjacent descriptors as d1 may behave as if d2's value were consistent with v1" (here, consistent means satisfying the Contiguous bit constraint). This is typically what could happen if a translation table walk loaded d1, cached an entry for b in the TLB based on v1, and a subsequent walk reused this entry rather than loading d2.

I turned to the ARMv8-A Reference Manual for answers, and got confused by the wording in section D4.2.6, under "Misprogramming of the Contiguous bit":
In some implementations, such misprogramming might also give rise to a TLB Conflict abort.

The architecture guarantees that misprogramming of the Contiguous bit cannot provide a mechanism for any of the following to occur:
- Software executing at EL1 or EL0 accessing regions of physical memory that are not accessible by programming the translation tables, from EL1, with arbitrary chosen values that do not misprogram the Contiguous bit.
- Software executing at EL1 or EL0 accessing regions of physical memory with attributes or permissions that are not possible by programming the translation tables, from EL1, with arbitrary chosen values that do not misprogram the Contiguous bit.
- Software executing in Non-secure state accessing Secure physical memory.
It seems that I may have to worry about TLB Conflict aborts; beyond that, I am unsure what to expect. My interpretation of the manual is that misprogramming the Contiguous bit at privilege level X must never allow X to escape the access restrictions that more privileged levels have enforced, assuming that those levels do not misprogram the Contiguous bit. This would be too weak a guarantee for my purposes: in my case, EL2 would misprogram the Contiguous bit, but I would still like guarantees for the accesses from EL0 and EL1.

Does this mean that the only portable options are not to use the Contiguous bit in such cases, or to make sure that other CPUs cannot access the affected ranges while the descriptors are being modified, and until the TLB entries have been invalidated?
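To make the transient violation concrete, here is a minimal Python model of one group of adjacent descriptors. It is only a sketch, not hypervisor code. It assumes the 4KB granule, where the Contiguous bit (descriptor bit 52) covers 16 adjacent level 3 entries, and the Access Flag at descriptor bit 10. The helper names (make_group, is_consistent, clear_af_stepwise) are made up for illustration, and the consistency rule is simplified to "all entries agree on every non-address bit and map consecutive pages of one aligned 64KB range".

```python
AF = 1 << 10          # Access Flag, descriptor bit 10
CONTIG = 1 << 52      # Contiguous bit, descriptor bit 52
VALID_PAGE = 0b11     # level 3 page descriptor type bits
GROUP = 16            # assumption: 4KB granule, 16 entries per contiguous group
PAGE_SHIFT = 12
OA_MASK = ((1 << 48) - 1) & ~((1 << PAGE_SHIFT) - 1)   # output address bits [47:12]


def make_group(base_pa, attrs=0):
    """Build 16 page descriptors mapping a 64KB-aligned range, Contiguous set."""
    assert base_pa % (GROUP << PAGE_SHIFT) == 0
    return [(base_pa + (i << PAGE_SHIFT)) | attrs | AF | CONTIG | VALID_PAGE
            for i in range(GROUP)]


def is_consistent(group):
    """Simplified check of the Contiguous bit constraint for one group."""
    if not any(d & CONTIG for d in group):
        return True   # no entry claims contiguity, nothing to violate
    first = group[0]
    base = first & OA_MASK
    if base % (GROUP << PAGE_SHIFT) != 0:
        return False
    for i, d in enumerate(group):
        if (d & ~OA_MASK) != (first & ~OA_MASK):
            return False   # attribute bits (including AF) differ
        if (d & OA_MASK) != base + (i << PAGE_SHIFT):
            return False   # output addresses are not consecutive pages
    return True


def clear_af_stepwise(group):
    """Clear AF one descriptor at a time, as separate stores would, and
    record whether the group is consistent after each store."""
    observed = []
    for i in range(len(group)):
        group[i] &= ~AF
        observed.append(is_consistent(group))
    return observed
```

Calling clear_af_stepwise(make_group(0x80000000)) reports the group as inconsistent after each of the first 15 stores and consistent again only after the last one. That is the window in which another CPU's table walk could load a descriptor that disagrees with its neighbours.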
Thank you for your reply.
I may not have to worry about this specific instance of TLB Conflict aborts, then. However, my main concern is that there are no guaranteed bounds to the effect at ELx of misprogramming the Contiguous bit at ELx. If my hypervisor rewrites the AF bits as I described, then all bets are off; all the ARM ARM promises is that Secure physical memory won't be accessed.
Is there a conclusion for this question? I have a similar doubt.
No. In the absence of new information, I decided to stick to the guarantees of the ARM ARM; therefore I do not use the Contiguous bit in descriptors that are concurrently modified and observed by other CPUs.
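As an illustration of that policy, here is a small Python sketch of a descriptor builder. It is a model under stated assumptions rather than real hypervisor code: 4KB granule with 16 entries per contiguous group, Contiguous at bit 52, Access Flag at bit 10, and IPA and PA equally aligned within 64KB so that only the PA needs checking. The function name build_descriptors and the revocable flag are hypothetical.

```python
AF = 1 << 10          # Access Flag, descriptor bit 10
CONTIG = 1 << 52      # Contiguous bit, descriptor bit 52
VALID_PAGE = 0b11     # level 3 page descriptor type bits
GROUP = 16            # assumption: 4KB granule
PAGE_SHIFT = 12
GROUP_BYTES = GROUP << PAGE_SHIFT


def build_descriptors(base_pa, n_pages, revocable):
    """Return level 3 page descriptors for n_pages starting at base_pa.

    The Contiguous bit is set only on whole, aligned groups of 16 pages,
    and never when the range is marked revocable (hypothetical flag: the
    hypervisor may clear AF in this range while other CPUs are running).
    """
    end_pa = base_pa + (n_pages << PAGE_SHIFT)
    descs = []
    for i in range(n_pages):
        pa = base_pa + (i << PAGE_SHIFT)
        d = pa | AF | VALID_PAGE
        group_start = pa - pa % GROUP_BYTES
        whole_group = group_start >= base_pa and group_start + GROUP_BYTES <= end_pa
        if whole_group and not revocable:
            d |= CONTIG
        descs.append(d)
    return descs
```

With this, a range that the hypervisor may later revoke is always mapped with independent descriptors, so clearing AF entry by entry never produces a misprogrammed group. The cost is giving up the TLB benefit of the Contiguous bit for those ranges.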
Thanks. I haven't found a way to guarantee atomicity when setting or clearing a bit in one of the PTEs within a group of contiguous PTEs, especially when a CPU access arrives at the same time. What does the hardware do while the PTE is being updated?
Anyway, I'll post my question first and hope someone can answer :)