Linux kernel boot-up issue in ARM Cortex-A78 while accessing virtual memory

Hi All,

I am not sure whether this is the right place to ask this question,
but we have been stuck at this point for a long time, so I decided to post it here.
Please read the post and let me know if I should ask this question on some other forum.

We are trying to boot Linux kernel 5.10.39 with U-Boot 2021.10-rc2 on a Veloce emulator with an ARM Cortex-A78 core in it.

Below are the details of our setup:

Hardware details:
    - an ARM Cortex-A78 core
    - an interconnect
    - boot SRAM
    - an NS16550-A UART
    -> We access this hardware through the Veloce emulator.

Software details:
    -> We place the boot code, uboot.bin, the kernel image and the device tree file at predefined RAM locations, as shown in the memory map below.
    -> We are following this link ( https://linux-sunxi.org/U-Boot ) for the U-Boot configuration.
        Below are the steps we executed for our setup:
        - prepared ARM Trusted Firmware for the Juno platform (as Juno is the closest match for our system) using the command below:
            # make CROSS_COMPILE=aarch64-linux-gnu- PLAT=juno DEBUG=1 bl31
            (We chose Juno here, as both Juno and our core use the ARMv8-A architecture and support AArch64.)

        - used vexpress_aemv8a_juno_defconfig, with the command below:
            # make CROSS_COMPILE=aarch64-linux-gnu- BL31=<path_to_arm-trusted-firmware>/build/sun50i_a64/debug/bl31.bin vexpress_aemv8a_juno_defconfig

        - modified CONFIG_SYS_TEXT_BASE to the memory address where we place uboot.bin in SRAM
        - changed the UART configuration to match our NS16550-A UART
    -> The kernel is compiled with the "relocatable kernel = NO" option, a 4KB page size and 39-bit virtual addresses (VA_BITS = 39).

Memory Map:
    0x000000000000 to 0x000000FFFFFF => memory Zone 1 => boot code is placed at 0x00 (ARM core reset address)
    0x974400000000 to 0x9747FFFFFFFF => memory Zone 2 => U-Boot image is placed at 0x974400000000 (size 650KB)
                                                      => U-Boot RAM top is configured at 0x974408000000 (128MB)
                                                      => kernel image is placed at 0x974409000000 (size 9.5MB)
    0x9A48AA140000 to 0x9A48AA17FFFF => memory Zone 3 => device tree is placed at 0x9A48AA140000
    0x82884C000000 to 0x82884C0FFFFF => memory Zone 4 => UART FIFO address is 0x82884C000000
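
One thing that may be worth checking against this map is how many physical address bits each zone needs. The sketch below only does that arithmetic on the addresses listed above; the helper names (significant_bits, fits_in_pa_bits, print_zone) are made up for illustration. Whether the core, the interconnect and the translation setup really decode that many bits is an assumption to verify on the target, for example against ID_AA64MMFR0_EL1.PARange.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Number of bits needed to represent addr (0 for addr == 0). */
unsigned significant_bits(uint64_t addr)
{
    unsigned n = 0;

    while (addr != 0) {
        n++;
        addr >>= 1;
    }
    return n;
}

/* Non-zero if addr can be expressed in pa_bits of physical address. */
int fits_in_pa_bits(uint64_t addr, unsigned pa_bits)
{
    assert(pa_bits >= 1 && pa_bits <= 64);
    return significant_bits(addr) <= pa_bits;
}

/* Print one zone of the memory map with the width of its last address. */
void print_zone(const char *name, uint64_t start, uint64_t end)
{
    printf("%s: 0x%012llx..0x%012llx needs %u address bits\n",
           name,
           (unsigned long long)start,
           (unsigned long long)end,
           significant_bits(end));
}
```

Calling print_zone("Zone 2", 0x974400000000ULL, 0x9747FFFFFFFFULL) from a small host-side main() reports the width for that zone; the same call can be repeated for the other three zones.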

Execution flow:
    1. The A78 core comes out of reset.
    2. It executes the boot code present at its reset location:
        - the boot code configures our interconnect and unblocks the SRAM memory zones according to the configuration
        - it allocates the stack area and performs other necessary tasks to configure the A78 core registers
        - it then branches to the memory location where u-boot.bin is already present
        NOTE: The A78 core is in EL3 after reset and the boot code doesn't change it
    3. u-boot.bin starts executing:
        - we modified the vexpress_aemv8a.h file according to our SRAM addresses
        - we modified some other files as required at runtime
    4. The kernel code starts executing.

Problem:
    - control reaches start_kernel() in ./linux-5.10.39/init/main.c
    - inside start_kernel(), execution gets stuck in set_task_stack_end_magic()
    - more specifically, when the code executes the "*stackend = STACK_END_MAGIC;" line of set_task_stack_end_magic(), it raises a curr_el_spx_sync exception
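
For reference, a label like curr_el_spx_sync conventionally names the vector entry for a synchronous exception taken from the current exception level while using SP_ELx, which sits at offset 0x200 from VBAR_ELx. The sketch below decodes a vector offset into its source and type, assuming the label in the vector table follows that convention; the enum and function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Where the exception came from, per the AArch64 vector table layout. */
enum vec_source {
    VEC_CUR_EL_SP0   = 0, /* current EL, using SP_EL0 */
    VEC_CUR_EL_SPX   = 1, /* current EL, using SP_ELx */
    VEC_LOWER_EL_A64 = 2, /* lower EL, running AArch64 */
    VEC_LOWER_EL_A32 = 3  /* lower EL, running AArch32 */
};

/* Kind of exception within each group of four entries. */
enum vec_type {
    VEC_SYNC   = 0,
    VEC_IRQ    = 1,
    VEC_FIQ    = 2,
    VEC_SERROR = 3
};

/* Each group spans 0x200 bytes, so bits [10:9] select the source. */
enum vec_source vector_source(uint32_t offset)
{
    assert(offset < 0x800);
    return (enum vec_source)(offset >> 9);
}

/* Each entry spans 0x80 bytes, so bits [8:7] select the type. */
enum vec_type vector_type(uint32_t offset)
{
    assert(offset < 0x800);
    return (enum vec_type)((offset >> 7) & 0x3);
}
```

Since this entry means the exception was taken without a change of exception level, the ESR_ELx, FAR_ELx and ELR_ELx registers of the level that was running at the time are the ones that hold the details.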

Our observations:
    - the "*stackend = STACK_END_MAGIC;" statement tries to store the STACK_END_MAGIC value to a virtual address
    - if we replace the virtual address (which has its upper 25 bits set) with a physical address (with its upper 25 bits clear), execution instead gets stuck in the next function, smp_setup_processor_id()
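
The "upper 25 bits" figure follows from the 39-bit VA configuration: 64 - 39 = 25, so with the MMU on, an address is translated through TTBR1 only if bits [63:39] are all ones, and through TTBR0 only if they are all zeros. The sketch below shows that check. The function names and the sample kernel address in the usage note are hypothetical, and the check says nothing about whether a valid mapping exists for the address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask covering the top (64 - va_bits) bits of a 64-bit address. */
uint64_t upper_bits_mask(unsigned va_bits)
{
    assert(va_bits >= 1 && va_bits <= 63);
    return ~((UINT64_C(1) << va_bits) - 1);
}

/* True if all upper bits are set: the address is in the TTBR1 (kernel) range. */
bool is_ttbr1_address(uint64_t addr, unsigned va_bits)
{
    uint64_t mask = upper_bits_mask(va_bits);

    return (addr & mask) == mask;
}

/* True if all upper bits are clear: the address is in the TTBR0 range. */
bool is_ttbr0_address(uint64_t addr, unsigned va_bits)
{
    return (addr & upper_bits_mask(va_bits)) == 0;
}
```

With va_bits = 39, a made-up kernel address such as 0xffffffc010000000 passes is_ttbr1_address(), while the physical kernel load address 0x974409000000 from the memory map passes neither check, because it has some but not all of bits [63:39] set. If such an address were ever used as a virtual address with the MMU enabled, it would fall outside both translation ranges.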


We need help resolving this issue.
Can anyone help us out here?
Please let me know if anything is unclear or if you need more information.

  • >- to be more specific, when the code tries to run "*stackend = STACK_END_MAGIC;" line of set_task_stack_end_magic() function, it generates an curr_el_spx_sync exception

    You need to check the ESR_ELx register to find the reason for the exception. I suspect that the MMU table was corrupted.
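
    As a rough aid for reading that register, here is a minimal sketch that splits an ESR_ELx value into its EC, IL and ISS fields and classifies the fault status code of an abort. The struct, enum and function names are made up for illustration, and only the fault status codes below 0x10 are classified. The value in the usage note is an example, not one captured from this system.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define ESR_EC_IABT_CUR 0x21u /* instruction abort, same exception level */
#define ESR_EC_DABT_CUR 0x25u /* data abort, same exception level */

enum fault_class {
    FAULT_ADDRESS_SIZE,
    FAULT_TRANSLATION,
    FAULT_ACCESS_FLAG,
    FAULT_PERMISSION,
    FAULT_OTHER /* external abort, alignment fault, etc. */
};

struct esr_fields {
    unsigned ec;  /* exception class, ESR bits [31:26] */
    unsigned il;  /* instruction length bit, ESR bit [25] */
    uint32_t iss; /* instruction specific syndrome, ESR bits [24:0] */
};

struct esr_fields esr_decode(uint64_t esr)
{
    struct esr_fields f;

    f.ec  = (unsigned)((esr >> 26) & 0x3f);
    f.il  = (unsigned)((esr >> 25) & 0x1);
    f.iss = (uint32_t)(esr & 0x1ffffff);
    return f;
}

/* Classify the fault status code in ISS[5:0] of an abort. */
enum fault_class abort_fault_class(uint32_t iss)
{
    unsigned fsc = iss & 0x3f;

    if (fsc >= 0x10)
        return FAULT_OTHER;
    switch (fsc >> 2) {
    case 0:  return FAULT_ADDRESS_SIZE;
    case 1:  return FAULT_TRANSLATION;
    case 2:  return FAULT_ACCESS_FLAG;
    default: return FAULT_PERMISSION;
    }
}

/* Translation table level of the fault; only valid for classified faults. */
unsigned abort_fault_level(uint32_t iss)
{
    assert(abort_fault_class(iss) != FAULT_OTHER);
    return iss & 0x3;
}

/* For a data abort, ISS bit [6] (WnR) is set when the access was a write. */
int data_abort_is_write(uint32_t iss)
{
    return (int)((iss >> 6) & 0x1);
}

void esr_print(uint64_t esr)
{
    struct esr_fields f = esr_decode(esr);

    printf("EC=0x%02x IL=%u ISS=0x%07x\n", f.ec, f.il, (unsigned)f.iss);
}
```

    For example, an ESR value of 0x96000044 decodes to EC 0x25 (data abort from the current exception level) with a level 0 translation fault on a write. For aborts, FAR_ELx holds the faulting address, so it is worth dumping alongside ESR_ELx.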
