Hi,
I am currently working on enabling SMMU-v3 in a hypervisor, and I noticed that SMMU-v3 has several memory attribute configuration options.
1. SMMU_CR1
a. TABLE_SH for Table access Shareability
b. TABLE_OC for Table access Outer Cacheability
c. TABLE_IC for Table access Inner Cacheability.
The same configuration options exist for queue access (QUEUE_SH, QUEUE_OC, QUEUE_IC).
---------------------------
/* CR1 (table and queue memory attributes) */
reg = (CR1_SH_ISH << CR1_TABLE_SH_SHIFT) |
(CR1_CACHE_WB << CR1_TABLE_OC_SHIFT) |
(CR1_CACHE_WB << CR1_TABLE_IC_SHIFT) |
(CR1_SH_ISH << CR1_QUEUE_SH_SHIFT) |
(CR1_CACHE_WB << CR1_QUEUE_OC_SHIFT) |
(CR1_CACHE_WB << CR1_QUEUE_IC_SHIFT);
writel_relaxed(reg, smmu->base + ARM_SMMU_CR1);
------------------------------
The driver always programs this register with the fixed value above, without considering SMMU_IDR0.COHACC, bit [4] (coherent access supported to translations, structures and queues). If an SMMU has COHACC set to 0, it does not support coherent access and can only use non-cacheable memory, yet CR1 is still configured as cacheable. Does the SMMU work properly in that case?
2. STE
a. [169:168] S2IR0 for Inner region Cacheability for stage 2 translation table access.
b. [171:170] S2OR0 for Outer region Cacheability for stage 2 translation table access.
c. [173:172] S2SH0 for Shareability for stage 2 translation table access
thanks
I think this is probably more a driver question than an SMMU question. What SMMU_IDR0.COHACC reports is whether the SMMU supports IO coherency for accessing queues and config structures. The ISH/OSH/... fields are software telling the SMMU what attributes to use when accessing the different structures.
If the SMMU supports coherent memory, then software might choose to take advantage of that by configuring the attributes that the CPU and the SMMU each use to access the structures so that they exploit coherency.
Or the SMMU might not support IO coherency for these accesses. Or, the SMMU might support it, but software might specify attributes that don't exploit the support. In either case, the SMMU can still access memory. But it will be up to software to ensure coherency (e.g. by using cache ops). The attributes software chooses for the queues and tables will impact what work is required to ensure coherency.
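The choice described above can be sketched in C. This is only one possible policy, not what any particular driver does: `smmu_pick_cr1` and `cr1_field` are hypothetical helpers, and the field positions and encodings are my reading of the SMMU_CR1 layout, so check them against the SMMUv3 specification before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* SMMU_IDR0.COHACC is bit [4], as quoted earlier in the thread. */
#define IDR0_COHACC         (1u << 4)

/* SMMU_CR1 field positions (each field is 2 bits wide); assumed layout. */
#define CR1_QUEUE_IC_SHIFT  0
#define CR1_QUEUE_OC_SHIFT  2
#define CR1_QUEUE_SH_SHIFT  4
#define CR1_TABLE_IC_SHIFT  6
#define CR1_TABLE_OC_SHIFT  8
#define CR1_TABLE_SH_SHIFT  10

/* Field encodings; assumed values. */
#define CR1_SH_OSH          2u  /* Outer Shareable */
#define CR1_SH_ISH          3u  /* Inner Shareable */
#define CR1_CACHE_NC        0u  /* Non-cacheable */
#define CR1_CACHE_WB        1u  /* Write-Back cacheable */

/* Place a 2-bit field value at the given shift. */
static uint32_t cr1_field(uint32_t val, unsigned shift)
{
    assert(val <= 3u);
    return val << shift;
}

/*
 * Hypothetical policy: use Write-Back, Inner Shareable attributes only when
 * the SMMU reports coherent access; otherwise fall back to Non-cacheable.
 */
static uint32_t smmu_pick_cr1(uint32_t idr0)
{
    uint32_t sh    = (idr0 & IDR0_COHACC) ? CR1_SH_ISH   : CR1_SH_OSH;
    uint32_t cache = (idr0 & IDR0_COHACC) ? CR1_CACHE_WB : CR1_CACHE_NC;

    return cr1_field(sh,    CR1_TABLE_SH_SHIFT) |
           cr1_field(cache, CR1_TABLE_OC_SHIFT) |
           cr1_field(cache, CR1_TABLE_IC_SHIFT) |
           cr1_field(sh,    CR1_QUEUE_SH_SHIFT) |
           cr1_field(cache, CR1_QUEUE_OC_SHIFT) |
           cr1_field(cache, CR1_QUEUE_IC_SHIFT);
}
```

Whichever value is written, the CPU-side mapping of the same memory still has to be chosen to match, or software has to do the cache maintenance itself.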
Hi Martin,
Thanks for your reply.
In fact, I did some testing on the Arm FVP platform with cache_state_modelled=1:
1. If I set SMMU_IDR0.COHACC to 1 and the CPU allocates cacheable memory for the queues and configuration structures, the test engine works well.
2. If I set SMMU_IDR0.COHACC to 0 and the CPU allocates non-cacheable memory for the queues and configuration structures, I get an SMMU event C_BAD_STREAMID (0x2) or C_BAD_STE (0x4).
2.1. Based on case 2, if I run the FVP with cache_state_modelled=0, it also works well.
Based on case 2.1, I think the case 2 failure results from cache behavior rather than from a real StreamID error, such as:
a) L1STD.Span == 0
b) L1STD.Span == Reserved
c) L1STD.Span > SMMU_(S_)STRTAB_BASE_CFG.SPLIT + 1
d) The given StreamID[SPLIT - 1:0] >= 2^(L1STD.Span-1)
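For reference, the L1STD.Span conditions listed above can be written as a small C check, which is handy for ruling out a genuine descriptor error when dumping the stream table. This is a sketch under assumptions: `check_l1std` and the enum are hypothetical names, and treating Span values 12 to 31 as the reserved range is my reading of the specification.

```c
#include <assert.h>
#include <stdint.h>

enum l1std_check {
    L1STD_OK,
    L1STD_SPAN_ZERO,        /* a) Span == 0 */
    L1STD_SPAN_RESERVED,    /* b) Span is a reserved value (assumed 12..31) */
    L1STD_SPAN_TOO_BIG,     /* c) Span > SPLIT + 1 */
    L1STD_SID_OUT_OF_RANGE  /* d) StreamID[SPLIT-1:0] >= 2^(Span-1) */
};

/*
 * span:      L1STD.Span as read back from the level 1 descriptor
 * split:     SMMU_(S_)STRTAB_BASE_CFG.SPLIT
 * stream_id: the StreamID of the faulting transaction
 */
static enum l1std_check check_l1std(uint32_t span, uint32_t split,
                                    uint32_t stream_id)
{
    assert(split < 32u);

    if (span == 0u)
        return L1STD_SPAN_ZERO;
    if (span > 11u)
        return L1STD_SPAN_RESERVED;
    if (span > split + 1u)
        return L1STD_SPAN_TOO_BIG;

    /* Index into the level 2 table: the low SPLIT bits of the StreamID. */
    uint32_t l2_index = stream_id & ((1u << split) - 1u);
    if (l2_index >= (1u << (span - 1u)))
        return L1STD_SID_OUT_OF_RANGE;

    return L1STD_OK;
}
```

If the descriptor read back by the CPU passes this check but the SMMU still reports C_BAD_STREAMID, that points at the SMMU seeing different data than the CPU, which is consistent with a coherency problem.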
So I am not sure how to configure the queue and config memory for the SMMU in case 2. Can you share some details on this case?
Thanks
It does sound as if the problem is down to memory coherency problems.
Have you double-checked the attributes used on the CPU for the memory? One test would be to see whether adding a Clean to the PoC (followed by a DSB) for the memory housing the queue/ST/etc. makes the problem go away.
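A minimal sketch of that test in C, assuming an AArch64 CPU and code running at a privilege level where `DC CVAC` is permitted. The helper names are hypothetical; on other architectures the maintenance operations compile to no-ops so that the range arithmetic can still be checked on a host machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest data cache line size in bytes, decoded from CTR_EL0.DminLine. */
static size_t dcache_line_bytes(uint64_t ctr_el0)
{
    return (size_t)4 << ((ctr_el0 >> 16) & 0xfu);
}

/* Clean one cache line by virtual address to the Point of Coherency. */
static void clean_line_to_poc(uintptr_t addr)
{
#if defined(__aarch64__)
    __asm__ volatile("dc cvac, %0" : : "r"(addr) : "memory");
#else
    (void)addr; /* no-op when not building for AArch64 */
#endif
}

/* Wait for the preceding cache maintenance to complete. */
static void dsb_sy(void)
{
#if defined(__aarch64__)
    __asm__ volatile("dsb sy" : : : "memory");
#endif
}

/*
 * Clean [start, start + len) to the PoC, then issue a DSB.
 * Returns the number of cache lines touched.
 */
static size_t clean_range_to_poc(const void *start, size_t len, size_t line)
{
    assert(line != 0 && (line & (line - 1)) == 0); /* power of two */

    if (len == 0)
        return 0;

    uintptr_t addr = (uintptr_t)start & ~(uintptr_t)(line - 1);
    uintptr_t end  = (uintptr_t)start + len;
    size_t lines   = 0;

    for (; addr < end; addr += line) {
        clean_line_to_poc(addr);
        lines++;
    }
    dsb_sy();
    return lines;
}
```

The idea is to call `clean_range_to_poc` on the stream table, queues and descriptors after the CPU writes them and before telling the SMMU about the update.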
I checked the attributes used on the CPU for the memory: the MAIR value is Normal Non-cacheable (0x44), and the correct index is set in the PTE.
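That check can be expressed as a short C sketch, assuming a stage 1 format descriptor where AttrIndx sits in bits [4:2]. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MAIR attribute byte: Normal memory, Inner and Outer Non-cacheable. */
#define MAIR_ATTR_NORMAL_NC  0x44u

/* AttrIndx field of a stage 1 block/page descriptor (assumed bits [4:2]). */
#define PTE_ATTRINDX_SHIFT   2
#define PTE_ATTRINDX_MASK    0x7u

/* Extract attribute byte number idx (0..7) from a MAIR_ELx value. */
static uint8_t mair_attr(uint64_t mair, unsigned idx)
{
    assert(idx < 8u);
    return (uint8_t)(mair >> (8u * idx));
}

/* Extract AttrIndx from a descriptor. */
static unsigned pte_attr_index(uint64_t pte)
{
    return (unsigned)((pte >> PTE_ATTRINDX_SHIFT) & PTE_ATTRINDX_MASK);
}

/* True if the descriptor selects the Normal Non-cacheable MAIR entry. */
static bool pte_is_normal_nc(uint64_t mair, uint64_t pte)
{
    return mair_attr(mair, pte_attr_index(pte)) == MAIR_ATTR_NORMAL_NC;
}
```

Running this over every descriptor that maps the queues and tables, including any aliases, helps confirm that no cacheable mapping of the same memory is left behind.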
After I add a cache operation, the SMMU works, and the upstream device can access memory normally. All the listed tests have the same result on a real silicon chip and on the FVP RevC platform. details as
Based on the tests above, it looks like the SMMU waits for the CPU cache operation before it does the real memory access (accessing a PTE or writing the event queue), and the SMMU uses the CPU cache operation as the synchronizing signal.
Doing cache operations on non-cacheable memory looks weird. Can you share more detailed logic rules (similar to the pseudocode in the Arm architecture spec) from deep inside the SMMU IP? And why does the SMMU follow these rules?