OSv guest encountering EC - "Unknown Reason" sync exception (ESR = 0x2000000) on Raspberry PI 4B host with KVM on

Hi,

I am one of the OSv unikernel developers and I have been stuck trying to figure out the problem described here - github.com/.../1100 - for a long time now to no avail. And I am looking for any more suggestions on how to debug it further.

In essence, occasionally (20-30% of the time) OSv guest running on QEMU with KVM on Raspberry PI 4B host encounters "Unknown Reason" (ESR_EL1 = 0x2000000) sync exception when running the same test application (does not seem to be specific to any given application). It seems that the longer the application runs the more frequently this exception happens. Here is the state of the registers from the guest gdb debug session captured when exception encountered and after saving exception frame on the stack (sorted for convenience):

(gdb) info reg
ACTLR_EL1 0x0 0
ACTLR_EL2 0x0 0
ACTLR_EL3 0x0 0
AFSR0_EL1 0x0 0
AFSR1_EL1 0x0 0
AIDR 0x0 0
AMAIR0 0x0 0
CLIDR 0x0 0
CONTEXTIDR_EL1 0x0 0
CPACR 0x300000 3145728
CSSELR 0x2 2
CTR_EL0 0x0 0
DACR32_EL2 0x1de7ec7edbadc0de 2154950976315703518
DBGBCR 0x0 0
DBGBVR 0x0 0
DBGWCR 0x0 0
DBGWVR 0x0 0
ELR_EL1 0x1000000311c0 17592186245568
ESR_EL1 0x2000000 33554432
FAR_EL1 0x1000000321b8 17592186249656
FPCR 0x0 0
FPEXC32_EL2 0x700 1792
FPSR 0x14 20
ID_AA64AFR0_EL1 0x0 0
ID_AA64AFR1_EL1 0x0 0
ID_AA64AFR2_EL1_RESERVED 0x0 0
ID_AA64AFR3_EL1_RESERVED 0x0 0
ID_AA64DFR0_EL1 0x0 0
ID_AA64DFR1_EL1 0x0 0
ID_AA64DFR2_EL1_RESERVED 0x0 0
ID_AA64DFR3_EL1_RESERVED 0x0 0
ID_AA64ISAR0_EL1 0x10000 65536
ID_AA64ISAR1_EL1 0x0 0
ID_AA64ISAR2_EL1_RESERVED 0x0 0
ID_AA64ISAR3_EL1_RESERVED 0x0 0
ID_AA64ISAR4_EL1_RESERVED 0x0 0
ID_AA64ISAR5_EL1_RESERVED 0x0 0
ID_AA64ISAR6_EL1_RESERVED 0x0 0
ID_AA64ISAR7_EL1_RESERVED 0x0 0
ID_AA64MMFR0_EL1 0x1124 4388
ID_AA64MMFR1_EL1 0x0 0
ID_AA64MMFR2_EL1_RESERVED 0x0 0
ID_AA64MMFR3_EL1_RESERVED 0x0 0
ID_AA64MMFR4_EL1_RESERVED 0x0 0
ID_AA64MMFR5_EL1_RESERVED 0x0 0
ID_AA64MMFR6_EL1_RESERVED 0x0 0
ID_AA64MMFR7_EL1_RESERVED 0x0 0
ID_AA64PFR1_EL1 0x0 0
ID_AA64PFR2_EL1_RESERVED 0x0 0
ID_AA64PFR3_EL1_RESERVED 0x0 0
ID_AA64PFR5_EL1_RESERVED 0x0 0
ID_AA64PFR6_EL1_RESERVED 0x0 0
ID_AA64PFR7_EL1_RESERVED 0x0 0
ID_AA64ZFR0_EL1 0x0 0
ID_AFR0 0x0 0
ID_DFR0 0x0 0
ID_ISAR0 0x2101110 34607376
ID_ISAR1 0x13112111 319889681
ID_ISAR2 0x21232042 555950146
ID_ISAR3 0x1112131 17899825
ID_ISAR4 0x11142 69954
ID_ISAR5 0x10001 65537
ID_ISAR6 0x0 0
ID_MMFR0 0x0 0
ID_MMFR1 0x0 0
ID_MMFR2 0x0 0
ID_MMFR3 0x0 0
ID_MMFR4 0x0 0
ID_PFR0 0x0 0
IFSR32_EL2 0x1de7ec7edbadc0de 2154950976315703518
MAIR_EL1 0xff440c0400 1096358298624
MAIR_EL3 0x0 0
MDCCSR_EL0 0x0 0
MDCR_EL3 0x0 0
MDRAR_EL1 0x0 0
MDSCR_EL1 0x0 0
MVFR0_EL1 0x10110222 269550114
MVFR1_EL1 0x12111111 303108369
MVFR2_EL1 0x43 67
MVFR3_EL1_RESERVED 0x0 0
MVFR4_EL1_RESERVED 0x0 0
MVFR5_EL1_RESERVED 0x0 0
MVFR6_EL1_RESERVED 0x0 0
MVFR7_EL1_RESERVED 0x0 0
OSLSR_EL1 0xa 10
PAR_EL1 0x1d000c7edbadc0de 2089683966090920158
PMCCFILTR_EL0 0x0 0
PMCCNTR_EL0 0x1de7ec7edbadc0de 2154950976315703518
PMCEID0_EL0 0x20001 131073
PMCEID1_EL0 0x0 0
PMCNTENCLR_EL0 0x5badc0c0 1538113728
PMCNTENSET_EL0 0x5badc0c0 1538113728
PMCR_EL0 0x4102302c 1090662444
PMEVCNTR0_EL0 0x1de7ec7edbadc0de 2154950976315703518
PMEVCNTR1_EL0 0x1de7ec7edbadc0de 2154950976315703518
PMEVCNTR2_EL0 0x1de7ec7edbadc0de 2154950976315703518
PMEVCNTR3_EL0 0x1de7ec7edbadc0de 2154950976315703518
PMEVTYPER0_EL0 0x1de7ec7edbadc0de 2154950976315703518
PMEVTYPER1_EL0 0x1de7ec7edbadc0de 2154950976315703518
PMEVTYPER2_EL0 0x1de7ec7edbadc0de 2154950976315703518
PMEVTYPER3_EL0 0x1de7ec7edbadc0de 2154950976315703518
PMINTENCLR_EL1 0x5badc0c0 1538113728
PMINTENSET_EL1 0x5badc0c0 1538113728
PMOVSCLR_EL0 0x1de7ec7edbadc0de 2154950976315703518
PMOVSSET_EL0 0x1de7ec7edbadc0de 2154950976315703518
PMSELR_EL0 0x1de7ec7edbadc0de 2154950976315703518
PMUSERENR_EL0 0x0 0
REVIDR_EL1 0x0 0
RVBAR_EL1 0x0 0
SCTLR 0x30d0199d 818944413
SPSR_ABT 0x0 0
SPSR_EL1 0x80000345 2147484485
SPSR_FIQ 0x0 0
SPSR_IRQ 0x0 0
SPSR_UND 0x0 0
SP_EL0 0x0 0
SP_EL1 0x200000200d60 35184374189408
TCR_EL1 0x15b5103510 93232051472
TPIDRRO_EL0 0x1de7ec7edbadc0de 2154950976315703518
TPIDR_EL0 0xffff900040f78000 -123144212348928
TPIDR_EL1 0x1de7ec7edbadc0de 2154950976315703518
TTBR0_EL1 0x407bf000 1081864192
TTBR1_EL1 0x407b5000 1081823232
VBAR 0x40207000 1075867648
cpsr 0x800003c5 -2147482683
fpcr 0x0 0
fpsr 0x14 20
pc 0x402077a0 0x402077a0 <entry_invalid_from_sync>
sp 0x200000200d60 0x200000200d60
x0 0x1 1
x1 0x2000000 33554432
x10 0xffffa00040ca9e00 -105552029245952
x11 0xffffa00040e1a600 -105552027736576
x12 0x100000032048 17592186249288
x13 0x1 1
x14 0x4100000000 279172874240
x15 0x0 0
x16 0x100000050000 17592186372096
x17 0x402b0240 1076560448
x18 0x0 0
x19 0x200000200ed0 35184374189776
x2 0x0 0
x20 0xffff900040f78000 -123144212348928
x21 0x1 1
x22 0x1 1
x23 0xffff900040a2f090 -123144217890672
x24 0x12ca90abac 80707890092
x25 0x0 0
x26 0xffffa00040f79200 -105552026299904
x27 0xffff8000407cd739 -140736406431943
x28 0xffffa00040f79278 -105552026299784
x29 0x200000200e80 35184374189696
x3 0x0 0
x30 0x1000000311b8 17592186245560
x4 0x1000000321e2 17592186249698
x5 0x200000200ee2 35184374189794
x6 0x76 118
x7 0x34 52
x8 0x34 52
x9 0x1 1

OSv, being unikernel runs its kernel and application in the same memory space at EL1. And here is the exceptions vectors table:

.macro vector_entry label idx
/* every entry is at 2^7 bits distance */
.align 7
        b       \label
.endm

.global exception_vectors
.type exception_vectors, @function
.align 12
exception_vectors:
        /* Current Exception level with SP_EL0 : unused */
        vector_entry entry_invalid  0   // Synchronous
        vector_entry entry_invalid  1   // IRQ or vIRQ
        vector_entry entry_invalid  2   // FIQ or vFIQ
        vector_entry entry_invalid  3   // SError or vSError

        /* Current Exception level with SP_ELx : only actually used */
        vector_entry entry_sync     4
        vector_entry entry_irq      5
        vector_entry entry_fiq      6
        vector_entry entry_serror   7

        /* Lower Exception level in AArch64 : unused since we don't go to EL0 */
        vector_entry entry_invalid  8
        vector_entry entry_invalid  9
        vector_entry entry_invalid 10
        vector_entry entry_invalid 11

        /* Lower Exception level in AArch32 : no El0, no AArch32 */
        vector_entry entry_invalid 12
        vector_entry entry_invalid 13
        vector_entry entry_invalid 14
        vector_entry entry_invalid 15

The most difficult issue is that I do not know why we occasionally encounter this exception. The ARMV8 documentation gives many reasons (developer.arm.com/.../esr_el1 but none of the ones I tried to investigate seems to match what I am seeing.

One of the reasons could be unallocated instruction or instruction using not-allowed system register (like MSR, MRS) but that does not seem to be the case. The same application seems to encounter the Uknown Reason exception (if it does) at various places in the code and there does not seem to be anything consistent about it, except it is always the app code and never the kernel part (see addresses above 0x0000100000000000). Here are some examples of instructions per ELR_EL1 (ESR_EL1 is always 0x2000000):

Address, Instruction
---------------------------------------
0x0000100000031000 d63f0020 blr x1
0x0000100000031140 f9404c01 ldr x1, [x0, #152]
0x00001000000311c0 94000182 bl 17c8 <*ABS*@plt+0x7b8>
0x0000100000031600 b4000040 cbz x0, 1608 <*ABS*@plt+0x5f8>
0x0000100000031640 f00000e0 adrp x0, 20000 <*ABS*@plt+0x1eff0>
0x0000100000031200 2a1503e0 mov w0, w21

As you can see some instructions involve memory access but some do not (last example) and they all seem to be valid instructions.

The existing OSv sync exception handler looks like that:

121 .global entry_invalid_from_sync
122 .type entry_invalid_from_sync, @function
123 entry_invalid_from_sync:
124 mrs x20, elr_el1 // Exception Link Register -> X20
125 mrs x21, spsr_el1 // Saved PSTATE -> X21
126 mrs x22, esr_el1 // Exception Syndrome Register -> X22
127
128 ubfm x23, x22, #ESR_EC_BEG, #ESR_EC_END // Exception Class -> X23
129 ubfm x24, x22, #ESR_ISS_BEG, #ESR_ISS_END // Instruction-Specific Syndrome -> X24
130
131 1: wfi
132 b 1b
133
134 .global entry_sync
135 .type entry_sync, @function
136 entry_sync:
137 push_state_to_exception_frame
138 mrs x1, esr_el1
139 ubfm x2, x1, #ESR_EC_BEG, #ESR_EC_END // Exception Class -> X2
140 ubfm x3, x1, #ESR_FLT_BEG, #ESR_FLT_END // FLT -> X3
141 cmp x2, #ESR_EC_DATA_ABORT
142 b.eq handle_mem_abort
143 cmp x2, #ESR_EC_INSN_ABORT
144 b.ne entry_invalid_from_sync
145 handle_mem_abort:
146 cbz x3, entry_invalid_from_mem_abort
147 cmp x3, #3
148 b.hi entry_invalid_from_mem_abort
149
150 mov x0, sp // save exception_frame to x0
151 bl page_fault
152 pop_state_from_exception_frame
153 eret

which makes OSv hang when encountering the Uknown Reason exception (see line 131,132).

Now I discovered that if I change the entry_invalid_from_sync routine to ignore the exception, pop the frame and let processing continue (like in the modified code below), the app and kernel continues and triggers the exact same exception again in the exact same address (same ELR_EL1 value) over and over again (~5,000 to ~20,000,000 times) UNTIL it eventually ceases to happen and the app successfully completes and OSv terminates. As I wrote at the very beginning of this email, the same app never encounters the EC=0 exception in 70-80% of the runs.

121 .global entry_invalid_from_sync
122 .type entry_invalid_from_sync, @function
123 entry_invalid_from_sync:
124 mov x0, sp // save exception_frame to x0
125 bl entry_sync_invalid_handler
126 pop_state_from_exception_frame
127 eret


Any ideas on what could be wrong or on how to further debug this problem? I can not see anything interesting in the host kern.log nor in QEMU log (qemu.log).

Please note that I have never been able to reproduce this issue with the same code (kernel and app) in the emulated mode (TCG) so it always seems to happen with KVM acceleration only.

Below is some information about specifics of the host, qemu, etc:

The host is Ubuntu 20.04.1 running Raspberry PI 4B with 4GB of RAM booting from SSD:

qemu-system-aarch64 --version
QEMU emulator version 4.2.1 (Debian 1:4.2-3ubuntu6.8)
Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers

uname -a
Linux ubuntu 5.4.0-1022-raspi #25-Ubuntu SMP PREEMPT Thu Oct 15 13:31:49 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux

lscpu
Architecture: aarch64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
Vendor ID: ARM
Model: 3
Model name: Cortex-A72
Stepping: r0p3
CPU max MHz: 1500.0000
CPU min MHz: 600.0000
BogoMIPS: 108.00
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Vulnerable
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fp asimd evtstrm crc32 cpuid

The qemu command line:
qemu-system-aarch64 \
-m 128M \
-smp 1 \
--nographic \
-gdb tcp::1234,server,nowait \
-kernel /home/wkozaczuk/projects/osv/build/last/loader.img \
-append "--disable_rofs_cache /tests/tst-tls.so" \
-machine virt \
-device virtio-blk-pci,id=blk0,drive=hd0,scsi=off \
-drive file=/home/wkozaczuk/projects/osv/build/release/tests_disk.img,if=none,id=hd0,cache=none,aio=native \
-netdev user,id=un0,net=192.168.122.0/24,host=192.168.122.1 \
-device virtio-net-pci,netdev=un0 \
-device virtio-rng-pci \
-enable-kvm \
-cpu host \
-d all \
-D ./qemu.log


Here is also the OSv guest memory layout of the kernel, application, and app stacks just in case.

0x0000000040090000 0x00000000407ac000 [KERNEL]

0x0000100000000000 0x0000100000001000 [4.0 kB] flags=fmF perm=rx offset=0x00000000 path=/libvdso.so
0x000010000001f000 0x0000100000020000 [4.0 kB] flags=fmF perm=r offset=0x0000f000 path=/libvdso.so
0x0000100000020000 0x0000100000021000 [4.0 kB] flags=fmF perm=rw offset=0x00010000 path=/libvdso.so
0x0000100000030000 0x0000100000033000 [12.0 kB] flags=fmF perm=rx offset=0x00000000 path=/tests/tst-tls.so
0x000010000004f000 0x0000100000050000 [4.0 kB] flags=fmF perm=r offset=0x0000f000 path=/tests/tst-tls.so
0x0000100000050000 0x0000100000051000 [4.0 kB] flags=fmF perm=rw offset=0x00010000 path=/tests/tst-tls.so
0x0000100000060000 0x0000100000061000 [4.0 kB] flags=fmF perm=rx offset=0x00000000 path=/tests/libtls.so
0x000010000007f000 0x0000100000080000 [4.0 kB] flags=fmF perm=r offset=0x0000f000 path=/tests/libtls.so
0x0000100000080000 0x0000100000081000 [4.0 kB] flags=fmF perm=rw offset=0x00010000 path=/tests/libtls.so
0x0000100000090000 0x00001000000a3000 [76.0 kB] flags=fmF perm=rx offset=0x00000000 path=/usr/lib/libgcc_s.so.1
0x00001000000bf000 0x00001000000c0000 [4.0 kB] flags=fmF perm=r offset=0x0001f000 path=/usr/lib/libgcc_s.so.1
0x00001000000c0000 0x00001000000c1000 [4.0 kB] flags=fmF perm=rw offset=0x00020000 path=/usr/lib/libgcc_s.so.1


0x0000200000000000 0x0000200000001000 [4.0 kB] flags=p perm=none
0x0000200000001000 0x0000200000002000 [4.0 kB] flags=p perm=none
0x0000200000002000 0x0000200000101000 [1020.0 kB] flags=p perm=rw
0x0000200000101000 0x0000200000102000 [4.0 kB] flags=p perm=none
0x0000200000102000 0x0000200000201000 [1020.0 kB] flags=p perm=rw

My regards,
Waldemar Kozaczuk

PS. I have posted a similar question on QEMU ARM forum, but since then I have verified I can reproduce exact same issue when running OSv on Firecracker hypervisor (https://firecracker-microvm.github.io/) on the same host. Firecracker uses KVM acceleration exclusively. This makes me think that this issue probably is not caused by a bug in QEMU.

More questions in this forum