I read https://developer.arm.com/documentation/ihi0070/latest/ but couldn't find any details about how it deals with huge pages.
Let's say I wanted to speed up an accelerator on my custom SoC by (requesting and successfully) allocating a 256MB page, since I know I'm going to be accessing that range a lot. Does the SMMU have any optimizations built in to take advantage of that? My thought is that it would only need one translation entry to cover the whole 256MB region. But I'm not sure how that works in relation to the translation granules supported (4KB, 16KB, and 64KB).
I'm not familiar with the MMU-700, but its TRM says:
"Optimization enables storage of all architecturally‐defined page and block sizes, including contiguous page and block entries, as a single entry in the TBU and TCU TLBs (WCs)"
https://developer.arm.com/documentation/101542/0102/Overview-of-MMU-700/Features?lang=en
So it seems the answer is that, as long as software does the right thing with the mappings and contiguous bits, the MMU-700 can take advantage of it.
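To make that concrete, here's a rough back-of-the-envelope sketch of how many coalesced TLB entries a 256MB region could boil down to per granule. The page/block sizes and Contiguous-bit group sizes are the architectural VMSAv8-A values (not anything MMU-700-specific), and the assumption that a contiguous group or block caches as a single entry is exactly the optimization the TRM quote describes:

```python
MB = 1024 * 1024

# granule -> (page size, level-2 block size,
#             Contiguous-bit group size at page level,
#             Contiguous-bit group size at level-2 block level)
# These are the architectural VMSAv8-A values from the Arm ARM.
GRANULES = {
    "4KB":  (4 * 1024,  2 * MB,    16, 16),
    "16KB": (16 * 1024, 32 * MB,  128, 32),
    "64KB": (64 * 1024, 512 * MB,  32, 32),
}

def min_tlb_entries(region_bytes, granule):
    """Fewest coalesced TLB entries covering a suitably aligned region,
    assuming each block or contiguous group caches as one entry."""
    page, l2_block, contig_pg, contig_l2 = GRANULES[granule]
    # candidate sizes a single (coalesced) TLB entry could cover
    units = [page, page * contig_pg, l2_block, l2_block * contig_l2]
    fitting = [u for u in units if u <= region_bytes and region_bytes % u == 0]
    return region_bytes // max(fitting)

for g in GRANULES:
    print(g, min_tlb_entries(256 * MB, g))
# 4KB  -> 8   (eight contiguous groups of 16 x 2MB blocks = 32MB each)
# 16KB -> 8   (eight 32MB level-2 blocks)
# 64KB -> 128 (256MB is smaller than the 512MB block, so 2MB contiguous groups)
```

Note the 64KB-granule case: 256MB falls below that granule's 512MB level-2 block size, so a 16KB or 4KB granule actually coalesces this particular region better.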
Awesome, thanks!