ARM v8 secondary CPU bootup

Hi experts,

     I am trying to wake up the secondary CPU core in a bootloader, and I am able to do this through trusted firmware. The problem comes after wakeup!

Once the CPU is up it is in EL2, and it executes a predefined function in which I load the stack pointer (SP) with a valid address and transfer control to a function written in C. Once control is transferred, it does not execute any of the further functions. Are there any additional settings that need to be done before passing control to the actual function?

Below is the code executed by the secondary CPU.

.global reset_handler_smp
reset_handler_smp:
        ldr     x3, =( svc_secondary_stack )    /* stack base ptr */
        mrs     x1, mpidr_el1                   /* read mpidr reg */
        and     x0, x1, #0xf                    /* cpuid =mpidr[3:0] */
        mov     x2, #STACK_SIZE                 /* stack size */
        mul     x0, x0, x2
        add     x3, x3, x0                      /* x3 - per cpu stack */
        bic     sp, x3, #7                      /* 8bit aligned stack */
        bl       _secondary_cpu_entry
looop: b looop

Can someone please help me?

Parents
  • Hi Mr. Ash,

                        Thanks for the inputs.

    • AArch64 uses a "Full Descending" stack
      The "full" means that the SP points to the most-recently-pushed item (unlike an "empty" stack, where SP points to the next empty slot for an item to be pushed to). The "descending" means that the SP is decremented when pushing items, and incremented when popping (unlike an "ascending" stack, which is the reverse)

      So, your code for initialising the SP is wrong, because CPU #0's SP will point to the base of your stack, and then pushing anything to the stack will take you outside of your stack region. To fix this, you should be loading the limit of the stack into X3, and then subtracting the multiple of (CPUID * STACK_SIZE) from this limit.
    • The bootloader I am working on is a bare-metal bootloader. I wanted the secondary stack to be separate from the main (CPU #0) stack to avoid overlap, so I placed it in the data segment by declaring a 128KB array, "svc_secondary_stack [128*1024]".
    • So "X0" points at the base of the stack, i.e. X0 + STACK_SIZE = stack_top.
    • The CPU_ID is #1 since it is the secondary core. I feel the multiplication instruction is redundant since it is a separate stack altogether.
    • I copied this code snippet from another platform, so it remained as it was. I hope the stack initialization process is fine! Correct me if I am wrong.

    You are aligning your stack to 8 bytes, but the AAPCS (ARM AArch64 Procedure Call Standard) mandates that the SP be 16-byte aligned.

              Yes, the 16-byte alignment was the mistake; it has been corrected. Now there is an exception caused by one of the functions, where the ESR_EL value is 0x96000035.

    • Did you mean to put a BL to _secondary_cpu_entry, rather than just a B?
      I wouldn't expect the secondary CPUs to ever return to this reset handler code, so a BL is probably wrong, and you actually just want a B.

              I agree with you, a plain B is sufficient. As I mentioned earlier, it was a copied code snippet, so it just remained. Thanks for pointing it out.


    Thanks,

    Harish.

Children
  • Thanks for the help, Mr. Ash Wilding.

  • Hello Harish,

    • The bootloader I am working on is a bare-metal bootloader. I wanted the secondary stack to be separate from the main (CPU #0) stack to avoid overlap, so I placed it in the data segment by declaring a 128KB array, "svc_secondary_stack [128*1024]".
    • So "X0" points at the base of the stack, i.e. X0 + STACK_SIZE = stack_top.

    AArch64 is a full descending stack, so you will want to initialise SP to be at stack_top, i.e. (svc_secondary_stack + STACK_SIZE).

    • The CPU_ID is #1 since it is the secondary core. I feel the multiplication instruction is redundant since it is a separate stack altogether.

    This is a common trick used in multiprocessor startup code. The idea is that you allocate a single large chunk of memory containing each CPU's stack, and then initialise each CPU's SP to an offset within that chunk. So in a system where each CPU has a 4KB stack, CPU0 would have its SP initialised to stack_top, CPU1 would have its SP initialised to (stack_top - 4KB), CPU2 would have its SP initialised to (stack_top - 8KB), and so on.
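
    As a rough illustration of that arithmetic, here is a minimal C sketch. The names cpu_stack_top, STACK_SIZE and NUM_CPUS and their values are assumptions made for this example, not taken from the original bootloader. In real startup code the same calculation would be done in assembly before any C function is called.

```c
#include <stdint.h>

#define STACK_SIZE (4u * 1024u)  /* assumed: 4KB of stack per CPU */
#define NUM_CPUS   4u            /* assumed: number of CPUs sharing the pool */

/*
 * Hypothetical helper: return the initial SP for cpu_id, given the
 * lowest address (pool_base) of one contiguous stack pool.
 * CPU0 gets the top of the pool; each further CPU starts one
 * STACK_SIZE lower. The stack is full descending, so the SP starts
 * at the highest address of that CPU's region.
 */
static inline uintptr_t cpu_stack_top(uintptr_t pool_base, unsigned cpu_id)
{
    uintptr_t pool_top = pool_base + (uintptr_t)NUM_CPUS * STACK_SIZE;
    uintptr_t sp = pool_top - (uintptr_t)cpu_id * STACK_SIZE;

    /* AAPCS64 requires the SP to be 16-byte aligned: round down. */
    return sp & ~(uintptr_t)0xF;
}
```

    With a pool starting at 0x80000000, this gives CPU0 an SP of 0x80004000 and CPU1 an SP of 0x80003000.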

    Note that if you wish to use the C standard library when doing this, you will need to retarget the __user_setup_stackheap function, as outlined on this ARM Infocenter page. (The article uses the R0, R2, and R3 registers because it was written for the legacy architecture; simply use X0, X2, and X3 instead.)

    • Yes, the 16-byte alignment was the mistake; it has been corrected. Now there is an exception caused by one of the functions, where the ESR_EL value is 0x96000035.

    According to the ARMv8-A Architecture Reference Manual (ARM DDI 0487A.f, page D7-1969), that corresponds to an IMPLEMENTATION DEFINED fault (Unsupported Exclusive access fault), which will probably require some investigation on your side. If you're not able to solve this yourself, you'll probably want to contact support@arm.com (assuming you have an entitlement to support).
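
    For reference, the syndrome value can be split into its fields as in the sketch below. The helper names esr_ec, esr_il and esr_dfsc are made up for this example. The bit positions follow the ESR_ELx layout: EC in bits [31:26], IL in bit [25], and, for Data Aborts, the fault status code (DFSC) in bits [5:0].

```c
#include <stdint.h>

/* Exception Class: ESR_ELx bits [31:26]. */
static inline unsigned esr_ec(uint32_t esr)
{
    return (esr >> 26) & 0x3Fu;
}

/* Instruction Length: ESR_ELx bit [25]. */
static inline unsigned esr_il(uint32_t esr)
{
    return (esr >> 25) & 0x1u;
}

/* Data Fault Status Code: ISS bits [5:0]; only meaningful for Data Aborts. */
static inline unsigned esr_dfsc(uint32_t esr)
{
    return esr & 0x3Fu;
}
```

    For 0x96000035 this gives EC = 0x25 (Data Abort taken without a change in Exception level), IL = 1, and DFSC = 0x35, which is the IMPLEMENTATION DEFINED fault encoding mentioned above.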

    I hope that helps,

    Ash.