Hello all,
I want to improve VM operation in the AArch64 port of FreeBSD, but I am stuck on the following problem.
The FreeBSD VM subsystem can map various *kernel* objects using superpage (higher-order) mappings. But during the lifecycle of these objects (because of COW and so on), we must be able to break these superpage mappings back into normal (lower-order) pages. Unfortunately, these objects can contain vital kernel data (kernel stacks of other threads, etc.), so we must be able to do this operation atomically, without the standard break-before-make approach: in an SMP environment it is impossible to temporarily unmap these objects, and any attempt to use some sort of serialization is counterproductive.
Let me give you an exact example:
Assume that we are talking about stage 1 translation only, with a 4kB translation granule and the contiguous bit not used. I have a 2MB level 2 block mapping and I want to break it into an equivalent (by size, type and attributes) page mapping. So the system prepares a fully populated level 3 page table with equivalent page table entries, then atomically swaps the level 2 block entry with the appropriate page table pointer entry and flushes the TLB.
The above approach looks safe to me: if a given PE already has the block mapping cached in its TLB, then it uses it for any address within the 2MB block; if not, then it does a table walk and uses the new table entry. Also, this cannot confuse any page table walks already running on this or another PE. There is nothing here that can lead to the multiple-TLB-entries undefined behavior.
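The sequence described above can be sketched as a small user-space model in C. This is only an illustration of the idea, not FreeBSD pmap code: the function names, the masks, and the use of a plain array as the level 3 table are assumptions made for the sketch, and it assumes a descriptor with nothing set in bits [20:12]. Whether the sequence is architecturally permitted is exactly the open question here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define L3_ENTRIES      512u                    /* 4 KB granule: 512 entries per table */
#define PAGE_SIZE_4K    0x1000ull
#define DESC_TYPE_MASK  0x3ull
#define DESC_BLOCK      0x1ull                  /* level 2 block descriptor */
#define DESC_TABLE      0x3ull                  /* table descriptor, also level 3 page */
#define L2_BLOCK_OA     0x0000ffffffe00000ull   /* block output address, bits [47:21] */
#define ADDR_47_12      0x0000fffffffff000ull   /* page / next-level table address */

/* Fill an L3 table with page entries equivalent to the given L2 block entry. */
static void
fill_l3_from_l2_block(uint64_t l2_block, uint64_t l3[L3_ENTRIES])
{
    assert((l2_block & DESC_TYPE_MASK) == DESC_BLOCK);

    uint64_t base = l2_block & L2_BLOCK_OA;
    /* Carry over the lower [11:2] and upper [63:48] attribute bits only. */
    uint64_t attrs = l2_block & ~(ADDR_47_12 | DESC_TYPE_MASK);

    for (unsigned i = 0; i < L3_ENTRIES; i++)
        l3[i] = (base + i * PAGE_SIZE_4K) | attrs | DESC_TABLE;
}

/*
 * Model of the single atomic store that replaces the block entry with a
 * pointer to the prepared L3 table. Returns the old (block) descriptor.
 * l3_table_pa is a hypothetical physical address of the L3 table.
 */
static uint64_t
swap_l2_entry(_Atomic uint64_t *l2_slot, uint64_t l3_table_pa)
{
    uint64_t table_desc = (l3_table_pa & ADDR_47_12) | DESC_TABLE;
    uint64_t old = atomic_exchange(l2_slot, table_desc);

    /* A real kernel would follow this with barriers and a TLB invalidate
     * covering the 2 MB range; that part is omitted from the model. */
    return old;
}
```

For a block descriptor such as 0x0060000040200705, every generated page entry keeps the same attribute bits, and only the output address advances in 4kB steps across the 2MB range.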
But “D4.10.1 General TLB maintenance requirements“ of the AArch64 ARM confuses me. The only (loosely) related part of this chapter is:
----------------------------------------------------------------------
Using break-before-make when updating translation table entries:
To avoid possibly creating multiple TLB entries for the same address … the architecture requires the use of a break-before-make sequence when changing translation table entries whenever multiple threads of execution can use the same translation tables and the change to the translation table entries involves any of:
…
- A change to the size of block used by the translation system. This applies both:
  - When changing from a smaller size to a larger size, for example by replacing a table mapping with a block mapping in a stage 2 translation table.
  - When changing from a larger size to a smaller size, for example by replacing a block mapping with a table mapping in a stage 2 translation table.
----------------------------------------------------------------------
There are some confusing items.
- What exactly is the “size of block used by the translation system”? The size of the memory mapped by a given entry/table? Or the size of the entry/table itself?
- Why do both paragraphs explicitly mention “stage 2”? Does this mean that they apply to stage 2 only? I’m pretty sure that they also apply to stage 1. Moreover, replacing a table mapping with a block mapping requires break-before-make in any case, even if the block sizes are equal.
So, please, can anybody confirm that the proposed approach (replacing a block mapping with an equivalent table mapping) without break-before-make is safe from an architectural point of view? Or, if the break-before-make approach is required by the AArch64 ARM, can you give me an example of a failing path?
Many thanks,
Michal Meloun
I can't think of an example to give.
Your point of view and experience are indeed shared by others: for instance, one of the links admits to an assumption that there is (very) little risk of a TLB conflict when splitting/demoting a block entry into an equivalent table entry. It also says that the assumption is not endorsed by the manual/architecture, and that matters are complicated further if the block sizes differ between stage 1 and stage 2 translations.
U-Boot's armv8 split_block function runs under the same assumption.
-------
Since the architecture asks for BBM during demotion, it does not forbid an implementation which breaks when the requirement is violated (admittedly, only for a short duration until the TLB invalidation arrives).
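For comparison, the ordering that break-before-make asks for can be modelled as below. This is a user-space sketch only: the barrier and TLB invalidate are hypothetical stand-ins (`model_dsb`, `model_tlbi`) that just record the order of steps, where a real kernel would issue DSB and broadcast TLBI instructions, and all names here are assumptions for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

enum bbm_step { STEP_INVALIDATE, STEP_DSB, STEP_TLBI, STEP_WRITE_NEW };

#define MAX_STEPS 8
static enum bbm_step trace[MAX_STEPS];   /* order of steps, for inspection */
static unsigned trace_len;

static void
record(enum bbm_step s)
{
    assert(trace_len < MAX_STEPS);
    trace[trace_len++] = s;
}

/* Hypothetical stand-ins for the real barrier and TLB invalidate. */
static void model_dsb(void) { record(STEP_DSB); }
static void model_tlbi(uint64_t va) { (void)va; record(STEP_TLBI); }

/* Replace *slot with new_desc using the break-before-make ordering. */
static void
bbm_replace_entry(_Atomic uint64_t *slot, uint64_t new_desc, uint64_t va)
{
    atomic_store(slot, 0);          /* break: write an invalid entry */
    record(STEP_INVALIDATE);
    model_dsb();                    /* make the invalid entry visible */
    model_tlbi(va);                 /* drop cached copies of the old entry */
    model_dsb();                    /* wait for the invalidation to complete */
    atomic_store(slot, new_desc);   /* make: write the new entry */
    record(STEP_WRITE_NEW);
    model_dsb();                    /* make the new entry visible */
}
```

Between the first store and the last one the range is unmapped for every PE sharing the tables, which is the window the original question is trying to avoid.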
It seems that the violation matters most when the hardware is permitted to /write/ to the entries in memory (which then races with the software's intentions).
You're right.
The architecture simply requires BBM during demotion, and it is not reasonable to deny or ignore that, even though my mind wants the opposite result and even though all my testing passed :)
FreeBSD is not my own hobby project; there is no room for violating architecture rules.
Anyway, many thanks for your effort and help, and I apologize for my stubbornness.