Hi,
I am trying to switch from EL2 to EL1 on a Cortex-A53, but it doesn't work.
Here is my current startup code:
#include <asm.h>

IMPORT_ASM(_cpu_el3_vec_tbl_set)
IMPORT_ASM(_cpu_el2_vec_tbl_set)
IMPORT_ASM(_cpu_el1_vec_tbl_set)
IMPORT_C(init)
IMPORT_C(main)

.text

ENTRY(_start)
    mrs x0, MPIDR_EL1
    and x0, x0, #0x3
    cmp x0, #0
    beq __elx
__wfe_cpu1_3:
    wfe
    b __wfe_cpu1_3
__elx:
__el3:
    mrs x0, CurrentEL
    and x0, x0, #0xC
    asr x0, x0, #2
    cmp x0, #3
    bne __el2
__el3_stack:
    ldr x0, =_stack_el3_e
    mov sp, x0
__el3_vector:
    bl _cpu_el3_vec_tbl_set
    msr SCTLR_EL2, xzr
    msr HCR_EL2, xzr
    mrs x0, SCR_EL3
    orr x0, x0, #(1<<10)
    orr x0, x0, #(1<<0)
    msr SCR_EL3, x0
    mov x0, #0b01001
    msr SPSR_EL3, x0
    adr x0, __el2
    msr ELR_EL3, x0
    eret
__el2:
    mrs x0, CurrentEL
    and x0, x0, #0xC
    asr x0, x0, #2
    cmp x0, #2
    bne __el1
__el2_stack:
    ldr x0, =_stack_el2_e
    mov sp, x0
__el2_vector:
    bl _cpu_el2_vec_tbl_set
    msr SCTLR_EL1, xzr
    mov x0, xzr
    orr x0, x0, #(1 << 31)
    msr HCR_EL2, x0
    /*orr x0, x0, #(7 << 6) */
    adr x0, __el1
    msr ELR_EL2, x0
    mov x0, xzr
    orr x0, x0, #(1 << 2)
    orr x0, x0, #(1 << 0)
    msr SPSR_EL2, x0
    eret
__el1:
__el1_stack:
    ldr x0, =_stack_el1_e
    mov sp, x0
__el1_vector:
    bl _cpu_el1_vec_tbl_set
    ldr x0, =_bss_s
    ldr x1, =_bss_e
    sub x1, x1, x0
    mov x2, #0x0
    cbz x1, __init
__bss:
    strb w2, [x0], #1
    sub x1, x1, #1
    cbnz x1, __bss
__init:
    bl init
__main:
    bl main
__main_wfe:
    wfe
    b __main_wfe

.end
When I comment out everything from label __elx to __el1, everything works (the C functions init and main are called).
I also set up a vector table and handlers for EL2 and EL1, but no exception is raised.
I have no idea what's going wrong.
What's the correct way to change to a lower exception level?
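For reference, the exception-level check used in the startup code above (mask CurrentEL with 0xC, then shift right by 2) can be sketched in C. The helper name `current_el_from_reg` is hypothetical and not part of the posted code; on real hardware the raw value would come from `mrs x0, CurrentEL`.

```c
#include <stdint.h>

/*
 * CurrentEL reports the exception level in bits [3:2].
 * This mirrors the "and x0, x0, #0xC" / "asr x0, x0, #2" pair
 * from the startup code. Hypothetical helper for illustration.
 */
static inline unsigned current_el_from_reg(uint64_t currentel)
{
    return (unsigned)((currentel & 0xCu) >> 2);
}
```

A raw value of 0x8 decodes to EL2 and 0x4 to EL1, so the `cmp x0, #2` in the `__el2` block matches only when the core really is at EL2.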
Maybe also set the I and F bits (bits 7 and 6) in SPSR_EL2, to make sure that no interrupt happens.
My boot code also clears SCTLR_EL1 before switching.
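A minimal sketch of the SPSR_EL2 value this suggestion leads to, assuming the target is EL1h (EL1 using SP_EL1) in AArch64 state. The macro and function names are hypothetical; the bit positions (F = 6, I = 7, A = 8, D = 9, M[3:0] = 0b0101 for EL1h) follow the AArch64 SPSR layout.

```c
#include <stdint.h>

/* Hypothetical names; bit positions per the AArch64 SPSR_ELx layout. */
#define SPSR_M_EL1H  0x5u        /* M[3:0] = 0b0101: EL1 with SP_EL1 */
#define SPSR_F       (1u << 6)   /* mask FIQ */
#define SPSR_I       (1u << 7)   /* mask IRQ */
#define SPSR_A       (1u << 8)   /* mask SError */
#define SPSR_D       (1u << 9)   /* mask debug exceptions */

/* Build the value to write to SPSR_EL2 before the eret into EL1h. */
static inline uint64_t spsr_el2_for_el1h(uint64_t masks)
{
    return (uint64_t)SPSR_M_EL1H | masks;
}
```

With `SPSR_I | SPSR_F` this gives 0xC5; masking D, A, I and F together gives 0x3C5.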
But shouldn't these exceptions be caught via VBAR_EL2? I have also set up a vector table and handlers for EL2 (bl _cpu_el2_vec_tbl_set).
The EL2 handlers are also called when I force (quick and dirty) an exception.
I have tried it, but without success.
The current code:
__el2:
    mrs x0, CurrentEL
    and x0, x0, #0xC
    asr x0, x0, #2
    cmp x0, #2
    bne __el1
__el2_stack:
    ldr x0, =_stack_el2_e
    mov sp, x0
__el2_vector:
    bl _cpu_el2_vec_tbl_set
    msr SPsel, #1
    mov x0, xzr
    orr x0, x0, #(1 << 11)
    orr x0, x0, #(1 << 20)
    orr x0, x0, #(1 << 22)
    orr x0, x0, #(1 << 28)
    orr x0, x0, #(1 << 29)
    msr SCTLR_EL1, x0
    mov x0, xzr
    orr x0, x0, #(1 << 31)
    msr HCR_EL2, x0
    adr x0, __el1
    mov x1, xzr
    orr x1, x1, #(3 << 6) /* Mask IRQ/FIQ */
    orr x1, x1, #(1 << 2)
    orr x1, x1, #(1 << 0)
    msr SPSR_EL2, x1
    msr ELR_EL2, x0
    eret
__el1:
Now I use the SCTLR_EL1 initialization from Linux.
But it is still not working. It still hangs.
    mov x0, xzr
    orr x0, x0, #(1 << 11)
    orr x0, x0, #(1 << 20)
    orr x0, x0, #(1 << 22)
    orr x0, x0, #(1 << 28)
    orr x0, x0, #(1 << 29)
    msr SCTLR_EL1, x0
Please check your setup. You are on a CA53 (Armv8.0), so (e.g.) bit 28 is not defined in SCTLR. It should not matter, but it is not a good idea to set reserved bits. Simply setting SCTLR to zero seems best.
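To compare the posted SCTLR_EL1 setup against the register description for the core in use, it can help to see the result as a single number. A minimal sketch, with hypothetical helper names, that builds the mask from the bit positions used in the `orr` sequence above (11, 20, 22, 28, 29):

```c
#include <stddef.h>
#include <stdint.h>

/* OR together one bit per listed position. Hypothetical helper. */
static uint64_t bits_to_mask(const unsigned *bits, size_t n)
{
    uint64_t mask = 0;
    for (size_t i = 0; i < n; i++)
        mask |= (uint64_t)1 << bits[i];
    return mask;
}

/* The value the posted orr sequence writes to SCTLR_EL1. */
static uint64_t posted_sctlr_el1(void)
{
    static const unsigned bits[] = { 11, 20, 22, 28, 29 };
    return bits_to_mask(bits, sizeof bits / sizeof bits[0]);
}
```

`posted_sctlr_el1()` evaluates to 0x30500800, which can then be checked bit by bit against the SCTLR_EL1 description in the architecture manual.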
Okay, I removed the setup of these bits. But it is still the same issue. :(
Can you output some feedback (UART, LED) directly after reaching EL1 (w/o doing anything else)?
Okay, EL1 will be entered. But I have some issues that are strange to me. First: execution in EL1 is slower than in EL2. Second: Some parts (my sprintf implementation) are not called. But there are no issues when I use these functions in EL2. Is it a configuration issue in HCR_EL2 and/or SCTLR_EL1?
krjdev said: Okay, EL1 will be entered
What was the reason? Please tell us (others might have the same problem).
Slower: Likely because a) there is no MMU setup and b) the I-cache is not enabled.
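The two causes named here map to enable bits in SCTLR_EL1: M (bit 0) for the MMU, C (bit 2) for the data cache and I (bit 12) for the instruction cache. A minimal sketch of the bit handling, with hypothetical names; on the target the value would be read with `mrs`, written back with `msr` and followed by an `isb`, and the MMU additionally needs translation tables, which this sketch does not cover.

```c
#include <stdint.h>

/* Hypothetical names; bit positions per the SCTLR_EL1 layout. */
#define SCTLR_M  (1u << 0)    /* stage 1 MMU enable */
#define SCTLR_C  (1u << 2)    /* data cache enable */
#define SCTLR_I  (1u << 12)   /* instruction cache enable */

/* Return the SCTLR_EL1 value with the instruction cache enabled. */
static inline uint64_t sctlr_el1_enable_icache(uint64_t sctlr)
{
    return sctlr | SCTLR_I;
}

/* Non-zero if the MMU and both caches are all still disabled. */
static inline int sctlr_el1_all_off(uint64_t sctlr)
{
    return (sctlr & (SCTLR_M | SCTLR_C | SCTLR_I)) == 0;
}
```

A startup path that writes zero to SCTLR_EL1, as the posted code does, leaves all three bits clear, which fits the slowdown described above.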
krjdev said: Some parts (my sprintf implementation) are not called.
This is very unspecific.
42Bastian Schick said: What was the reason? Please tell us (others might have the same problem).
EL1 was always entered. But there was (and still is) an issue with some parts of my C code, so I got no output via the SoC's UART. Strangely, these C functions work without issues in EL2, and also on other architectures. But maybe this issue exists because I have not yet enabled the MMU and the D/I-caches.
42Bastian Schick said: Slower: Likely because a) there is no MMU setup and b) the I-cache is not enabled.
Okay, I will add these to my current setup. Is the invalidating and cleaning of the I/D Cache also required for each exception level?
42Bastian Schick said: This is very unspecific.
I know, but currently I don't understand why these functions work in EL2.
krjdev said: Is the invalidating and cleaning of the I/D Cache also required for each exception level?
Yes. The state of the cache RAM is unknown.
Thank you.
Did you find a resolution in the responses here? If so, it would be great if you could mark any as correct :)
Hi, krjdev
I ran into the same problem as you. You said "But there was (is still) an issue with some parts of my C code, so I got no output via the SoC's UART." after entering EL1. Can you tell me how you discovered that? How did you fix the problem of getting no output via the SoC's UART?
Sorry for my late response.
I had some problems from mixing C functions with my startup code and other assembly code parts. I did not save (push) the frame pointer register (x29) and the link register (x30) correctly on the stack when I called C functions. So when I called a C function from assembly, control never returned to the original caller, and I got no output. I hope I have explained that correctly.
Might be a newbie issue...
Hi, krjdev.
It looks like I have run into the same problem as you. Can you give me some tips on how to solve it?
Thanks.