OSv guest encountering EC - "Unknown Reason" sync exception (ESR = 0x2000000) on Raspberry PI 4B host with KVM on

Hi,

I am one of the OSv unikernel developers and I have been stuck trying to figure out the problem described here - github.com/.../1100 - for a long time now to no avail. And I am looking for any more suggestions on how to debug it further.

In essence, occasionally (20-30% of the time) OSv guest running on QEMU with KVM on Raspberry PI 4B host encounters "Unknown Reason" (ESR_EL1 = 0x2000000) sync exception when running the same test application (does not seem to be specific to any given application). It seems that the longer the application runs the more frequently this exception happens. Here is the state of the registers from the guest gdb debug session captured when exception encountered and after saving exception frame on the stack (sorted for convenience):

(gdb) info reg

ACTLR_EL1 0x0 0

ACTLR_EL2 0x0 0

ACTLR_EL3 0x0 0

AFSR0_EL1 0x0 0

AFSR1_EL1 0x0 0

AIDR 0x0 0

AMAIR0 0x0 0

CLIDR 0x0 0

CONTEXTIDR_EL1 0x0 0

CPACR 0x300000 3145728

CSSELR 0x2 2

CTR_EL0 0x0 0

DACR32_EL2 0x1de7ec7edbadc0de 2154950976315703518

DBGBCR 0x0 0

DBGBVR 0x0 0

DBGWCR 0x0 0

DBGWVR 0x0 0

ELR_EL1 0x1000000311c0 17592186245568

ESR_EL1 0x2000000 33554432

FAR_EL1 0x1000000321b8 17592186249656

FPCR 0x0 0

FPEXC32_EL2 0x700 1792

FPSR 0x14 20

ID_AA64AFR0_EL1 0x0 0

ID_AA64AFR1_EL1 0x0 0

ID_AA64AFR2_EL1_RESERVED 0x0 0

ID_AA64AFR3_EL1_RESERVED 0x0 0

ID_AA64DFR0_EL1 0x0 0

ID_AA64DFR1_EL1 0x0 0

ID_AA64DFR2_EL1_RESERVED 0x0 0

ID_AA64DFR3_EL1_RESERVED 0x0 0

ID_AA64ISAR0_EL1 0x10000 65536

ID_AA64ISAR1_EL1 0x0 0

ID_AA64ISAR2_EL1_RESERVED 0x0 0

ID_AA64ISAR3_EL1_RESERVED 0x0 0

ID_AA64ISAR4_EL1_RESERVED 0x0 0

ID_AA64ISAR5_EL1_RESERVED 0x0 0

ID_AA64ISAR6_EL1_RESERVED 0x0 0

ID_AA64ISAR7_EL1_RESERVED 0x0 0

ID_AA64MMFR0_EL1 0x1124 4388

ID_AA64MMFR1_EL1 0x0 0

ID_AA64MMFR2_EL1_RESERVED 0x0 0

ID_AA64MMFR3_EL1_RESERVED 0x0 0

ID_AA64MMFR4_EL1_RESERVED 0x0 0

ID_AA64MMFR5_EL1_RESERVED 0x0 0

ID_AA64MMFR6_EL1_RESERVED 0x0 0

ID_AA64MMFR7_EL1_RESERVED 0x0 0

ID_AA64PFR1_EL1 0x0 0

ID_AA64PFR2_EL1_RESERVED 0x0 0

ID_AA64PFR3_EL1_RESERVED 0x0 0

ID_AA64PFR5_EL1_RESERVED 0x0 0

ID_AA64PFR6_EL1_RESERVED 0x0 0

ID_AA64PFR7_EL1_RESERVED 0x0 0

ID_AA64ZFR0_EL1 0x0 0

ID_AFR0 0x0 0

ID_DFR0 0x0 0

ID_ISAR0 0x2101110 34607376

ID_ISAR1 0x13112111 319889681

ID_ISAR2 0x21232042 555950146

ID_ISAR3 0x1112131 17899825

ID_ISAR4 0x11142 69954

ID_ISAR5 0x10001 65537

ID_ISAR6 0x0 0

ID_MMFR0 0x0 0

ID_MMFR1 0x0 0

ID_MMFR2 0x0 0

ID_MMFR3 0x0 0

ID_MMFR4 0x0 0

ID_PFR0 0x0 0

IFSR32_EL2 0x1de7ec7edbadc0de 2154950976315703518

MAIR_EL1 0xff440c0400 1096358298624

MAIR_EL3 0x0 0

MDCCSR_EL0 0x0 0

MDCR_EL3 0x0 0

MDRAR_EL1 0x0 0

MDSCR_EL1 0x0 0

MVFR0_EL1 0x10110222 269550114

MVFR1_EL1 0x12111111 303108369

MVFR2_EL1 0x43 67

MVFR3_EL1_RESERVED 0x0 0

MVFR4_EL1_RESERVED 0x0 0

MVFR5_EL1_RESERVED 0x0 0

MVFR6_EL1_RESERVED 0x0 0

MVFR7_EL1_RESERVED 0x0 0

OSLSR_EL1 0xa 10

PAR_EL1 0x1d000c7edbadc0de 2089683966090920158

PMCCFILTR_EL0 0x0 0

PMCCNTR_EL0 0x1de7ec7edbadc0de 2154950976315703518

PMCEID0_EL0 0x20001 131073

PMCEID1_EL0 0x0 0

PMCNTENCLR_EL0 0x5badc0c0 1538113728

PMCNTENSET_EL0 0x5badc0c0 1538113728

PMCR_EL0 0x4102302c 1090662444

PMEVCNTR0_EL0 0x1de7ec7edbadc0de 2154950976315703518

PMEVCNTR1_EL0 0x1de7ec7edbadc0de 2154950976315703518

PMEVCNTR2_EL0 0x1de7ec7edbadc0de 2154950976315703518

PMEVCNTR3_EL0 0x1de7ec7edbadc0de 2154950976315703518

PMEVTYPER0_EL0 0x1de7ec7edbadc0de 2154950976315703518

PMEVTYPER1_EL0 0x1de7ec7edbadc0de 2154950976315703518

PMEVTYPER2_EL0 0x1de7ec7edbadc0de 2154950976315703518

PMEVTYPER3_EL0 0x1de7ec7edbadc0de 2154950976315703518

PMINTENCLR_EL1 0x5badc0c0 1538113728

PMINTENSET_EL1 0x5badc0c0 1538113728

PMOVSCLR_EL0 0x1de7ec7edbadc0de 2154950976315703518

PMOVSSET_EL0 0x1de7ec7edbadc0de 2154950976315703518

PMSELR_EL0 0x1de7ec7edbadc0de 2154950976315703518

PMUSERENR_EL0 0x0 0

REVIDR_EL1 0x0 0

RVBAR_EL1 0x0 0

SCTLR 0x30d0199d 818944413

SPSR_ABT 0x0 0

SPSR_EL1 0x80000345 2147484485

SPSR_FIQ 0x0 0

SPSR_IRQ 0x0 0

SPSR_UND 0x0 0

SP_EL0 0x0 0

SP_EL1 0x200000200d60 35184374189408

TCR_EL1 0x15b5103510 93232051472

TPIDRRO_EL0 0x1de7ec7edbadc0de 2154950976315703518

TPIDR_EL0 0xffff900040f78000 -123144212348928

TPIDR_EL1 0x1de7ec7edbadc0de 2154950976315703518

TTBR0_EL1 0x407bf000 1081864192

TTBR1_EL1 0x407b5000 1081823232

VBAR 0x40207000 1075867648

cpsr 0x800003c5 -2147482683

fpcr 0x0 0

fpsr 0x14 20

pc 0x402077a0 0x402077a0 <entry_invalid_from_sync>

sp 0x200000200d60 0x200000200d60

x0 0x1 1

x1 0x2000000 33554432

x10 0xffffa00040ca9e00 -105552029245952

x11 0xffffa00040e1a600 -105552027736576

x12 0x100000032048 17592186249288

x13 0x1 1

x14 0x4100000000 279172874240

x15 0x0 0

x16 0x100000050000 17592186372096

x17 0x402b0240 1076560448

x18 0x0 0

x19 0x200000200ed0 35184374189776

x2 0x0 0

x20 0xffff900040f78000 -123144212348928

x21 0x1 1

x22 0x1 1

x23 0xffff900040a2f090 -123144217890672

x24 0x12ca90abac 80707890092

x25 0x0 0

x26 0xffffa00040f79200 -105552026299904

x27 0xffff8000407cd739 -140736406431943

x28 0xffffa00040f79278 -105552026299784

x29 0x200000200e80 35184374189696

x3 0x0 0

x30 0x1000000311b8 17592186245560

x4 0x1000000321e2 17592186249698

x5 0x200000200ee2 35184374189794

x6 0x76 118

x7 0x34 52

x8 0x34 52

x9 0x1 1


OSv, being unikernel runs its kernel and application in the same memory space at EL1. And here is the exceptions vectors table:

.macro vector_entry label idx

/* every entry is at 2^7 bits distance */

.align 7

        b       \label

.endm

.global exception_vectors

.type exception_vectors, @function

.align 12

exception_vectors:

        /* Current Exception level with SP_EL0 : unused */

        vector_entry entry_invalid  0   // Synchronous

        vector_entry entry_invalid  1   // IRQ or vIRQ

        vector_entry entry_invalid  2   // FIQ or vFIQ

        vector_entry entry_invalid  3   // SError or vSError

        /* Current Exception level with SP_ELx : only actually used */

        vector_entry entry_sync     4

        vector_entry entry_irq      5

        vector_entry entry_fiq      6

        vector_entry entry_serror   7

        /* Lower Exception level in AArch64 : unused since we don't go to EL0 */

        vector_entry entry_invalid  8

        vector_entry entry_invalid  9

        vector_entry entry_invalid 10

        vector_entry entry_invalid 11

        /* Lower Exception level in AArch32 : no El0, no AArch32 */

        vector_entry entry_invalid 12

        vector_entry entry_invalid 13

        vector_entry entry_invalid 14

        vector_entry entry_invalid 15

The most difficult issue is that I do not know why we occasionally encounter this exception. The ARMV8 documentation gives many reasons (developer.arm.com/.../esr_el1 but none of the ones I tried to investigate seems to match what I am seeing.

One of the reasons could be unallocated instruction or instruction using not-allowed system register (like MSR, MRS) but that does not seem to be the case. The same application seems to encounter the Uknown Reason exception (if it does) at various places in the code and there does not seem to be anything consistent about it, except it is always the app code and never the kernel part (see addresses above 0x0000100000000000). Here are some examples of instructions per ELR_EL1 (ESR_EL1 is always 0x2000000):

Address, Instruction
---------------------------------------
0x0000100000031000 d63f0020 blr x1

0x0000100000031140 f9404c01 ldr x1, [x0, #152]


0x00001000000311c0 94000182 bl 17c8 <*ABS*@plt+0x7b8>

0x0000100000031600 b4000040 cbz x0, 1608 <*ABS*@plt+0x5f8>

0x0000100000031640 f00000e0 adrp x0, 20000 <*ABS*@plt+0x1eff0>

0x0000100000031200 2a1503e0 mov w0, w21

As you can see some instructions involve memory access but some do not (last example) and they all seem to be valid instructions.

The existing OSv sync exception handler looks like that:

121 .global entry_invalid_from_sync

122 .type entry_invalid_from_sync, @function

123 entry_invalid_from_sync:

124 mrs x20, elr_el1 // Exception Link Register -> X20

125 mrs x21, spsr_el1 // Saved PSTATE -> X21

126 mrs x22, esr_el1 // Exception Syndrome Register -> X22

127

128 ubfm x23, x22, #ESR_EC_BEG, #ESR_EC_END // Exception Class -> X23

129 ubfm x24, x22, #ESR_ISS_BEG, #ESR_ISS_END // Instruction-Specific Syndrome -> X24

130

131 1: wfi

132 b 1b

133

134 .global entry_sync

135 .type entry_sync, @function

136 entry_sync:

137 push_state_to_exception_frame

138 mrs x1, esr_el1

139 ubfm x2, x1, #ESR_EC_BEG, #ESR_EC_END // Exception Class -> X2

140 ubfm x3, x1, #ESR_FLT_BEG, #ESR_FLT_END // FLT -> X3

141 cmp x2, #ESR_EC_DATA_ABORT

142 b.eq handle_mem_abort

143 cmp x2, #ESR_EC_INSN_ABORT

144 b.ne entry_invalid_from_sync

145 handle_mem_abort:

146 cbz x3, entry_invalid_from_mem_abort

147 cmp x3, #3

148 b.hi entry_invalid_from_mem_abort

149

150 mov x0, sp // save exception_frame to x0

151 bl page_fault

152 pop_state_from_exception_frame

153 eret


which makes OSv hang when encountering the Uknown Reason exception (see line 131,132).

Now I discovered that if I change the entry_invalid_from_sync routine to ignore the exception, pop the frame and let processing continue (like in the modified code below), the app and kernel continues and triggers the exact same exception again in the exact same address (same ELR_EL1 value) over and over again (~5,000 to ~20,000,000 times) UNTIL it eventually ceases to happen and the app successfully completes and OSv terminates. As I wrote at the very beginning of this email, the same app never encounters the EC=0 exception in 70-80% of the runs.

121 .global entry_invalid_from_sync

122 .type entry_invalid_from_sync, @function

123 entry_invalid_from_sync:


124 mov x0, sp // save exception_frame to x0

125 bl entry_sync_invalid_handler

126 pop_state_from_exception_frame

127 eret


Any ideas on what could be wrong or on how to further debug this problem? I can not see anything interesting in the host kern.log nor in QEMU log (qemu.log).

Please note that I have never been able to reproduce this issue with the same code (kernel and app) in the emulated mode (TCG) so it always seems to happen with KVM acceleration only.

Below is some information about specifics of the host, qemu, etc:

The host is Ubuntu 20.04.1 running Raspberry PI 4B with 4GB of RAM booting from SSD:
qemu-system-aarch64 --version

QEMU emulator version 4.2.1 (Debian 1:4.2-3ubuntu6.8)

Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers


uname -a

Linux ubuntu 5.4.0-1022-raspi #25-Ubuntu SMP PREEMPT Thu Oct 15 13:31:49 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux


lscpu

Architecture: aarch64

CPU op-mode(s): 32-bit, 64-bit

Byte Order: Little Endian

CPU(s): 4

On-line CPU(s) list: 0-3

Thread(s) per core: 1

Core(s) per socket: 4

Socket(s): 1

Vendor ID: ARM

Model: 3

Model name: Cortex-A72

Stepping: r0p3

CPU max MHz: 1500.0000

CPU min MHz: 600.0000

BogoMIPS: 108.00

Vulnerability Itlb multihit: Not affected

Vulnerability L1tf: Not affected

Vulnerability Mds: Not affected

Vulnerability Meltdown: Not affected

Vulnerability Spec store bypass: Vulnerable

Vulnerability Spectre v1: Mitigation; __user pointer sanitization

Vulnerability Spectre v2: Vulnerable

Vulnerability Srbds: Not affected

Vulnerability Tsx async abort: Not affected

Flags: fp asimd evtstrm crc32 cpuid


The qemu comman line:
qemu-system-aarch64 \

-m 128M \

-smp 1 \

--nographic \

-gdb tcp::1234,server,nowait \

-kernel /home/wkozaczuk/projects/osv/build/last/loader.img \

-append "--disable_rofs_cache /tests/tst-tls.so" \

-machine virt \

-device virtio-blk-pci,id=blk0,drive=hd0,scsi=off \

-drive file=/home/wkozaczuk/projects/osv/build/release/tests_disk.img,if=none,id=hd0,cache=none,aio=native \

-netdev user,id=un0,net=192.168.122.0/24,host=192.168.122.1 \

-device virtio-net-pci,netdev=un0 \

-device virtio-rng-pci \

-enable-kvm \

-cpu host \

-d all \

-D ./qemu.log


Here is also the OSv guest memory layout of the kernel, application, and app stacks just in case.

0x0000000040090000 0x00000000407ac000 [KERNEL]

0x0000100000000000 0x0000100000001000 [4.0 kB] flags=fmF perm=rx offset=0x00000000 path=/libvdso.so

0x000010000001f000 0x0000100000020000 [4.0 kB] flags=fmF perm=r offset=0x0000f000 path=/libvdso.so

0x0000100000020000 0x0000100000021000 [4.0 kB] flags=fmF perm=rw offset=0x00010000 path=/libvdso.so

0x0000100000030000 0x0000100000033000 [12.0 kB] flags=fmF perm=rx offset=0x00000000 path=/tests/tst-tls.so

0x000010000004f000 0x0000100000050000 [4.0 kB] flags=fmF perm=r offset=0x0000f000 path=/tests/tst-tls.so

0x0000100000050000 0x0000100000051000 [4.0 kB] flags=fmF perm=rw offset=0x00010000 path=/tests/tst-tls.so

0x0000100000060000 0x0000100000061000 [4.0 kB] flags=fmF perm=rx offset=0x00000000 path=/tests/libtls.so

0x000010000007f000 0x0000100000080000 [4.0 kB] flags=fmF perm=r offset=0x0000f000 path=/tests/libtls.so

0x0000100000080000 0x0000100000081000 [4.0 kB] flags=fmF perm=rw offset=0x00010000 path=/tests/libtls.so

0x0000100000090000 0x00001000000a3000 [76.0 kB] flags=fmF perm=rx offset=0x00000000 path=/usr/lib/libgcc_s.so.1

0x00001000000bf000 0x00001000000c0000 [4.0 kB] flags=fmF perm=r offset=0x0001f000 path=/usr/lib/libgcc_s.so.1

0x00001000000c0000 0x00001000000c1000 [4.0 kB] flags=fmF perm=rw offset=0x00020000 path=/usr/lib/libgcc_s.so.1

0x0000200000000000 0x0000200000001000 [4.0 kB] flags=p perm=none

0x0000200000001000 0x0000200000002000 [4.0 kB] flags=p perm=none

0x0000200000002000 0x0000200000101000 [1020.0 kB] flags=p perm=rw

0x0000200000101000 0x0000200000102000 [4.0 kB] flags=p perm=none

0x0000200000102000 0x0000200000201000 [1020.0 kB] flags=p perm=rw

My regards,
Waldemar Kozaczuk

PS. I have posted a similar question on QEMU ARM forum, but since then I have verified I can reproduce exact same issue when running OSv on Firecracker hypervisor (https://firecracker-microvm.github.io/) on the same host. Firecracker uses KVM acceleration exclusively. This makes me think that this issue probably is not caused by a bug in QEMU.

More questions in this forum