Hi all,
I'm trying to boot Linux in my hypervisor-like environment.
During boot, an unexpected Hyp trap occurred and the processor entered Hyp mode.
In Hyp mode, the Hyp Syndrome Register (HSR) value is 0x93830006.
According to the manual, this means "Fault not on a stage 2 translation for a stage 1 translation table walk" (EC is 0x24 and ISS[7] is 0).
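
For anyone cross-checking that decoding, here is a minimal Python sketch that splits the HSR value into the fields quoted above. The bit positions are my reading of the ARMv7-A ARM data abort syndrome layout, and the function name `decode_hsr` is only illustrative, so treat it as an assumption to verify against the manual rather than a reference decoder.

```python
def decode_hsr(hsr):
    """Split a 32-bit HSR value into the fields relevant to a data abort.

    Bit positions are assumed from the ARMv7-A ARM (EC = HSR[31:26],
    ISS[7] = S1PTW, ISS[5:0] = DFSC); check them against your manual.
    """
    return {
        "ec": (hsr >> 26) & 0x3F,   # exception class
        "il": (hsr >> 25) & 0x1,    # instruction length (1 = 32-bit instruction)
        "isv": (hsr >> 24) & 0x1,   # instruction syndrome valid
        "s1ptw": (hsr >> 7) & 0x1,  # ISS[7]
        "wnr": (hsr >> 6) & 0x1,    # 1 = write, 0 = read
        "dfsc": hsr & 0x3F,         # data fault status code
    }


fields = decode_hsr(0x93830006)
for name, value in fields.items():
    print(f"{name:6s} = {value:#x}")
```

With this layout, EC comes out as 0x24 and ISS[7] as 0, matching the values above, and the fault status code is 0x06, which in the Long-descriptor encoding I believe is a level 2 translation fault.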
I checked the page table, but I could not find any problem with it.
Why does this Hyp exception occur?
Thanks,
Takumi
Hi Matt,
Thank you for your advice.
The register values are as follows:
hpfar: 0x908000
hdfar: 0x90800004
par: 0xa0d (after writing the HDFAR value to ATS12NSOPR)
hcr: 0x81019
hsr: 0x93830006
vtcr: 0x80000540
ttbcr: 0x8f010f00
ttbr0: 0x80003000
ttbr1: 0xd5082008f38
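
As a hedged aid for reading two of these values, the Python sketch below converts HPFAR into a faulting IPA and decodes the fault fields of PAR. It assumes HPFAR[31:4] holds IPA[39:12] and that PAR is in the 64-bit (Long-descriptor) format with F, FST, S2WLK, FSTAGE and LPAE at bits 0, 6:1, 8, 9 and 11. Those positions are my reading of the ARMv7-A ARM and the helper names are hypothetical.

```python
def hpfar_to_ipa(hpfar):
    """Return the page-aligned faulting IPA, assuming HPFAR[31:4] = IPA[39:12]."""
    return (hpfar & 0xFFFFFFF0) << 8


def decode_par_fault(par):
    """Decode the low word of PAR after an aborted address translation.

    Assumes the 64-bit PAR format; the bit positions are an assumption
    to be checked against the manual.
    """
    return {
        "f": par & 0x1,              # 1 = translation aborted
        "fst": (par >> 1) & 0x3F,    # fault status, same encoding as DFSC
        "s2wlk": (par >> 8) & 0x1,   # stage 2 fault during a stage 1 walk
        "fstage": (par >> 9) & 0x1,  # 0 = stage 1, 1 = stage 2
        "lpae": (par >> 11) & 0x1,   # 1 = Long-descriptor format
    }


print(hex(hpfar_to_ipa(0x908000)))
print(decode_par_fault(0xA0D))
```

Under these assumptions, HPFAR 0x908000 corresponds to IPA 0x90800000, and PAR 0xa0d decodes with F = 1, FST = 0x06 and FSTAGE = 1.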
I also checked the table with gdb.
The VA was 0x90800004, TTBR0 was 0x80003000, and the table uses the Long-descriptor format.
Therefore, I walked the table as follows:
(gdb)
799 gic_irqs = readl_relaxed(gic_data_dist_base(gic) + GIC_DIST_CTR) & 0x1f;
1: /x $pc = 0x805bad64
(gdb) print/x 0x80003000
$1 = 0x80003000
(gdb) print/x *(0x80003000)
$2 = 0x80004003
(gdb) print/x *(0x80003010)
$7 = 0x80006003
(gdb) print/x *(0x80006000)
$8 = 0x8000040d
(gdb) print/x *(0x80006420)
$11 = 0x8f005003
(gdb) print/x *(0x8f005000)
$12 = 0x1401713
(gdb) print/x *(0x01401000)
$13 = 0x0
(gdb) print/x *(0x01401004)
$14 = 0xf807
But when I tried to access 0x90800004 directly, it failed.
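
The table offsets used in the gdb session can be reproduced with a short Python sketch. It assumes the Long-descriptor format with a 4KB granule and a walk that starts at level 1 from TTBR0 (T0SZ = 0), so the indices are VA[31:30], VA[29:21] and VA[20:12] with 8-byte descriptors. The helper names are hypothetical, and the descriptor values are the ones printed by gdb above (low 32 bits only).

```python
DESC_SIZE = 8  # Long-descriptor entries are 64 bits wide


def lpae_indices(va):
    """Return the (level 1, level 2, level 3) table indices for a 32-bit VA."""
    return (va >> 30) & 0x3, (va >> 21) & 0x1FF, (va >> 12) & 0x1FF


def descriptor_address(table_base, index):
    """Address of the descriptor at `index` in the table at `table_base`."""
    return table_base + index * DESC_SIZE


def output_address(desc_low_word):
    """Next-level table or page address from the low word of a descriptor."""
    return desc_low_word & 0xFFFFF000


va = 0x90800004
l1, l2, l3 = lpae_indices(va)

l1_addr = descriptor_address(0x80003000, l1)                  # TTBR0 base
l2_addr = descriptor_address(output_address(0x80006003), l2)  # value read at l1_addr
l3_addr = descriptor_address(output_address(0x8F005003), l3)  # value read at l2_addr
ipa = output_address(0x01401713) | (va & 0xFFF)               # value read at l3_addr

print(hex(l1_addr), hex(l2_addr), hex(l3_addr), hex(ipa))
```

The three descriptor addresses match the ones read in gdb (0x80003010, 0x80006420, 0x8f005000), and the resulting stage 1 output address is 0x01401004.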
takumishimada wrote:
(gdb) print/x *(0x01401000)
$13 = 0x0
Assuming you're picking the right table offsets, doesn't this imply a faulting entry at stage 1, which explains the fault status?
I am not sure why you check 0x01401004 -- the descriptor at 0x01401000 would be the one mapping that particular block (0x90800000 + 0x4 offset is always going to be well within the first block so it's the first descriptor of that table level).
Ta,
Matt
I'm sorry. I was confused.
(gdb) print/x 0x80003000
$1 = 0x80003000
(gdb) print/x *(0x80003000)
$2 = 0x80004003
(gdb) print/x *(0x80003010)
$7 = 0x80006003
(gdb) print/x *(0x80006000)
$8 = 0x8000040d
(gdb) print/x *(0x80006420)
$11 = 0x8f005003
(gdb) print/x *(0x8f005000)
$12 = 0x1401713
This means virtual address 0x90800000 is mapped to intermediate physical address 0x1401000 correctly, right?
Accessing 0x01401000 from gdb reads the physical memory that virtual address 0x01401000 is mapped to, not intermediate physical address 0x01401000.
However, this seems to mean that the stage 1 table walk is correct and should never generate a fault.
Why does access to 0x90800000 generate "Fault not on a stage 2 translation for a stage 1 translation table walk"?
Hi takumishimada,
"Fault not on a stage 2 translation for a stage 1 translation table walk" means the fault is in stage 1, not in stage 2, although stage 2 is involved (i.e. an IPA was converted to a PA at some point to resolve the actual descriptor location). At the faulting point the PA of the descriptor is fine, but the descriptor found there indicates some kind of fault: either a faulting entry, a permission problem, or something similar.
I did have one thought: 0x90800000 is in the upper part of the virtual address space, so it would be handled (given your TTBCR T0SZ of 0x0 and T1SZ of 0x1) by TTBR1's tables, no?
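
To make the TTBR0/TTBR1 split concrete, here is a minimal Python sketch of the selection rule as I understand it from B3.6.4 of the ARM ARM for the Long-descriptor format. It assumes T0SZ is TTBCR[2:0] and T1SZ is TTBCR[18:16]; `select_ttbr` is a hypothetical helper, not something from Linux or any library.

```python
def select_ttbr(va, ttbcr):
    """Return "TTBR0", "TTBR1", or None (translation fault) for a 32-bit VA."""
    t0sz = ttbcr & 0x7
    t1sz = (ttbcr >> 16) & 0x7
    ttbr0_limit = 1 << (32 - t0sz)               # TTBR0 region is [0, limit)
    ttbr1_base = (1 << 32) - (1 << (32 - t1sz))  # TTBR1 region is [base, 2^32)

    if t1sz == 0:
        # TTBR1 is only used above the TTBR0 region, if T0SZ leaves any room.
        return "TTBR0" if va < ttbr0_limit else "TTBR1"
    if t0sz == 0:
        return "TTBR1" if va >= ttbr1_base else "TTBR0"
    if va < ttbr0_limit:
        return "TTBR0"
    if va >= ttbr1_base:
        return "TTBR1"
    return None  # the VA falls in the gap between the two regions


print(select_ttbr(0x90800004, 0x8F010F00))
print(select_ttbr(0x7FFFFFFF, 0x8F010F00))
```

With TTBCR = 0x8f010f00 the boundary sits at 0x80000000, so under this rule the faulting address is translated through TTBR1 and anything below 0x80000000 goes through TTBR0.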
The values of these registers are as follows:
ttbr1: 0x80003010
TTBR0's tables and TTBR1's tables are the same, so that doesn't seem to be the problem.
I'm sorry for my late reply.
That's true, it should work per B3.6.4 of the ARM ARM, given the base addresses.
What do your MAIR0&1 registers look like?
I also checked your registers again, and it looks like you had HCR.DC == 1 when you reported HCR. Are you sure this is what you want? It assumes that SCTLR.M is never enabled; otherwise all the MAIR indices in the Linux descriptors will be ignored. If you're mapping a device into that region and it gets a cacheable transaction (i.e. a linefill or eviction), it may respond extremely poorly. It is also not a good idea to do table walks to Device memory. You could very easily test this: set HCR.PTW=1.
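
For reference, the reported HCR value can be broken into named bits with a small Python sketch. The table lists only a subset of HCR fields, with positions taken from my reading of the ARMv7-A ARM (for example DC at bit 12 and PTW at bit 2), and `hcr_flags` is a hypothetical helper.

```python
# Subset of HCR bit positions, assumed from the ARMv7-A ARM.
HCR_BITS = {
    0: "VM",    # stage 2 translation enable
    1: "SWIO",
    2: "PTW",   # protected table walk
    3: "FMO",
    4: "IMO",
    5: "AMO",
    12: "DC",   # default cacheable
    13: "TWI",
    14: "TWE",
    19: "TSC",
    27: "TGE",
}


def hcr_flags(hcr):
    """Return the names of the known HCR bits that are set, lowest bit first."""
    return [name for bit, name in sorted(HCR_BITS.items()) if (hcr >> bit) & 1]


hcr = 0x81019
print(hcr_flags(hcr))
print(hex(hcr & ~(1 << 12)))  # the same value with DC cleared
```

With these positions, 0x81019 has VM, FMO, IMO, DC and TSC set, and clearing DC gives 0x80019.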
I have to admit we're grasping at straws here. To work out what's going on, you would need the full data set and some insight into what "hypervisor-like" means (i.e. your code), and this is not the place for that.
Your suspicion was right. I had assumed that HCR.DC is ignored when SCTLR.M is enabled.
After I cleared HCR.DC, the fault went away.
Thank you for your kind help.