Baremetal program jumps to 0x200

Hello, I am trying to run a "hello world" program with C/C++ standard library support on a Morello board (hardware), using Arm Development Studio Morello Edition.

I previously followed the standalone-baremetal-readme.rst guide, which worked well (following the advice from this topic), but it did not let me use functions like "printf".

I tried to use examples from:

https://git.morello-project.org/morello/llvm-project-releases/-/tree/morello/baremetal-release-1.6?ref_type=heads

I ran make and then "make-bm-image.sh" with the "-e" flag to produce "howdy-purecap-bm-image.elf" and "howdy-morello-bm-image.elf" (I added a line to the "make-bm-image.sh" script to preserve a copy of the .elf file), then loaded these in Development Studio.

It appears that the program goes to address 0x200 after executing the "MRS" instruction.

Does anyone know why that happens?

Also, in the standalone-baremetal-readme.rst guide it was necessary to specify the UART address (0x2A400000) in the program. Is it correct to assume that the examples from the baremetal-release-1.6 branch of llvm-project-releases use that address (without the need to specify it anywhere in the program), so that printf/cout messages appear on the AP COM port of the Morello hardware board? Or are some adjustments necessary to achieve that?

  • Hi there,

    Caveat of everything I say:
    I've checked things build with the libraries Kevin mentions, but not checked they run on the board (because I don't have easy access to a board).

    I just had a look for those libraries Kevin mentioned in the GCC toolchain distribution.  Those libraries are not in the package.
    One option would be to link to the libraries in the LLVM toolchain (if you have that), but rebuilding just newlib with the required flags isn't too hard and should leave you with a GNU setup that is much easier to use.

    You shouldn't have to recompile much of the toolchain -- just these particular libraries.

    FWIW I've inlined a session demonstrating how to build newlib starting with the GNU toolchain we distribute.
    Hopefully the comments should explain what's happening and why.

    vshcmd: > # Recommend using a separate untar of the toolchain.
    vshcmd: > # (Replacing bundled binaries with newly built binaries.  I don't
    vshcmd: > # know for certain these binaries would work -- though I think it's a
    vshcmd: > # good enough chance to try.)
    vshcmd: > tar -xaf arm-gnu-toolchain-10.1.morello-alp2-x86_64-aarch64-none-elf.tar.xz
    morello-distro [14:06:32] $
    vshcmd: > # Just to de-mystify this helper function I happen to use.
    vshcmd: > type newdir
    newdir is a function
    newdir ()
    {
        mkdir -p "$@";
        cd "${@: -1}"
    }
    morello-distro [14:06:34] $
    vshcmd: > newdir rebuild-newlib
    rebuild-newlib [14:06:36] $
    vshcmd: > git clone git.morello-project.org/.../newlib.git
    Cloning into 'newlib'...
    <snip>

    vshcmd: > newdir newlib-build
    newlib-build [14:07:32] $
    vshcmd: > PATH="$(pwd)/../../arm-gnu-toolchain-10.1.morello-alp2-x86_64-aarch64-none-elf/bin:$PATH" \
    vshcmd: > CC_FOR_TARGET=aarch64-none-elf-gcc \
    vshcmd: >     CFLAGS="-O1 -g" \
    vshcmd: >     CFLAGS_FOR_TARGET="-ffunction-sections -fdata-sections -O2 -g -DWANT_CHERI_QUALIFIER_MACROS -D__cheri_fromcap=" \
    vshcmd: >     CXXFLAGS_FOR_TARGET="-ffunction-sections -fdata-sections -O2 -g -DWANT_CHERI_QUALIFIER_MACROS -D__cheri_fromcap=" \
    vshcmd: > ../newlib/configure --disable-newlib-supplied-syscalls \
    vshcmd: >                     --enable-newlib-retargetable-locking \
    vshcmd: >                     --enable-newlib-reent-check-verify \
    vshcmd: >                     --enable-newlib-io-long-long \
    vshcmd: >                     --enable-newlib-io-c99-formats \
    vshcmd: >                     --enable-newlib-register-fini \
    vshcmd: >                     --enable-newlib-mb \
    vshcmd: >                     --target=aarch64-none-elf \
    vshcmd: >                     --prefix=/ --with-pkgversion=unknown
    <snip>
    vshcmd: > # N.b. I happen to build the entirety of newlib.
    vshcmd: > # I expect that we could get away with only building libgloss, but
    vshcmd: > # building everything doesn't take too long.
    vshcmd: > PATH="$(pwd)/../../arm-gnu-toolchain-10.1.morello-alp2-x86_64-aarch64-none-elf/bin:$PATH" \
    vshcmd: >       make -j8 all-target-newlib all-target-libgloss
    <snip>
    vshcmd: > # One option is to install in a local directory.
    vshcmd: > PATH="$(pwd)/../../arm-gnu-toolchain-10.1.morello-alp2-x86_64-aarch64-none-elf/bin:$PATH" \
    vshcmd: > make DESTDIR=$(pwd)/../newlib-install \
    vshcmd: >     INSTALL="/usr/bin/install -C" \
    vshcmd: >     install-target-newlib \
    vshcmd: >     install-target-libgloss
    <snip>
    vshcmd: > # I believe you would be able to directly use these binaries using
    vshcmd: > # `-nostartfiles` to avoid the non-EL2 crt0.o file and then manually
    vshcmd: > # specifying the crti.o, crtbegin.o, and morello-el2-crt0.o file.
    vshcmd: > # (N.b. don't worry about the warning -- it's not something that
    vshcmd: > # should stop things working).
    vshcmd: > cd ../../
    vshcmd: > ./arm-gnu-toolchain-10.1.morello-alp2-x86_64-aarch64-none-elf/bin/aarch64-none-elf-gcc \
    vshcmd: >   -march=morello+c64 -mabi=purecap -nostartfiles -L rebuild-newlib/newlib-install/aarch64-none-elf/lib/purecap/c64/ \
    vshcmd: >   -Wl,arm-gnu-toolchain-10.1.morello-alp2-x86_64-aarch64-none-elf/lib/gcc/aarch64-none-elf/10.1.0/purecap/c64/crti.o,arm-gnu-toolchain-10.1.morello-alp2-x86_64-aarch64-none-elf/lib/gcc/aarch64-none-elf/10.1.0/purecap/c64/crtbegin.o,./rebuild-newlib/newlib-install/aarch64-none-elf/lib/purecap/c64/morello-el2-crt0.o,./rebuild-newlib/newlib-install/aarch64-none-elf/lib/purecap/c64/cpu-init/morello-init-el2.o,--start-group,-lc,-lgloss-morello-el2,--end-group,arm-gnu-toolchain-10.1.morello-alp2-x86_64-aarch64-none-elf/lib/gcc/aarch64-none-elf/10.1.0/purecap/c64/crtend.o,arm-gnu-toolchain-10.1.morello-alp2-x86_64-aarch64-none-elf/lib/gcc/aarch64-none-elf/10.1.0/purecap/c64/crtn.o test.c -o test-direct-alt
    morello-distro [14:28:06] $ > > <snip>/aarch64-none-elf/bin/ld: rebuild-newlib/newlib-install/aarch64-none-elf/lib/purecap/c64//libgloss-morello-el2.a(morello-el2-uart.o)(.rela.got+0x1c0): warning: relocation R_MORELLO_ADR_GOT_PAGE against symbol '__UART' in section without permission flags '*ABS*'.  Assuming Read-Write.
    morello-distro [14:28:06] $
    vshcmd: > # However, it seems like it would be neater to instead install into
    vshcmd: > # the GCC install directory.  Then create a new `specs` file so you
    vshcmd: > # don't have to always specify the required crt0 startup object
    vshcmd: > # files.
    vshcmd: > cd rebuild-newlib/newlib-build/
    vshcmd: > PATH="$(pwd)/../../arm-gnu-toolchain-10.1.morello-alp2-x86_64-aarch64-none-elf/bin:$PATH" \
    vshcmd: > make DESTDIR=$(pwd)/../../arm-gnu-toolchain-10.1.morello-alp2-x86_64-aarch64-none-elf \
    vshcmd: >     INSTALL="/usr/bin/install -C" \
    vshcmd: >     install-target-newlib \
    vshcmd: >     install-target-libgloss
    <snip>
    vshcmd: > cd ../../
    morello-distro [14:29:51] $
    vshcmd: > # Manually defined specs file (i.e. write the below in the relevant file).
    vshcmd: > # Includes the new crt0.o file and the new morello-init-el2.o file.
    vshcmd: > cat arm-gnu-toolchain-10.1.morello-alp2-x86_64-aarch64-none-elf/aarch64-none-elf/lib/morello-board-el2.specs
    %rename lib libc

    *libgloss:
    -lgloss-morello-el2

    *startfile:
    crti.o%s crtbegin.o%s morello-el2-crt0.o%s

    *lib:
    cpu-init/morello-init-el2.o%s --start-group %(libc) %(libgloss) --end-group

    morello-distro [14:30:05] $
    vshcmd: > ./arm-gnu-toolchain-10.1.morello-alp2-x86_64-aarch64-none-elf/bin/aarch64-none-elf-gcc \
    vshcmd: >   -march=morello+c64 -mabi=purecap -specs=morello-board-el2.specs test.c -o test
    > <snip>/aarch64-none-elf/bin/ld: <snip>/aarch64-none-elf/lib/purecap/c64/libgloss-morello-el2.a(morello-el2-uart.o)(.rela.got+0x1c0): warning: relocation R_MORELLO_ADR_GOT_PAGE against symbol '__UART' in section without permission flags '*ABS*'.  Assuming Read-Write.
    morello-distro [14:31:11] $
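
    If it helps, here is a minimal sketch of creating that specs file from a script rather than typing it by hand. The function name write_specs and its directory argument are placeholders of mine, not part of the toolchain; the file contents are exactly the ones shown above.

```shell
#!/usr/bin/env bash
# Sketch only: write_specs is a hypothetical helper, not a toolchain command.
# It writes morello-board-el2.specs into the directory given as $1.
write_specs() {
    local dest_dir="$1"
    mkdir -p "$dest_dir"
    # The quoted 'EOF' stops the shell from touching %s and %(libc).
    # Each spec entry is followed by a blank line, as in the listing above.
    cat > "$dest_dir/morello-board-el2.specs" <<'EOF'
%rename lib libc

*libgloss:
-lgloss-morello-el2

*startfile:
crti.o%s crtbegin.o%s morello-el2-crt0.o%s

*lib:
cpu-init/morello-init-el2.o%s --start-group %(libc) %(libgloss) --end-group

EOF
}

# Example call (path taken from the session above):
# write_specs arm-gnu-toolchain-10.1.morello-alp2-x86_64-aarch64-none-elf/aarch64-none-elf/lib
```

    After that, the -specs=morello-board-el2.specs invocation above should work as before.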

  • Thank you very much, I would never be able to do that on my own.

    I followed the instructions and got the newlib-install ready, I recompiled the program and it looks like it no longer uses the CPTR_EL3.

    But I encountered another issue while trying to run the "hello_world" program. The _start function executes successfully; after that the program enters the ".pure" function, where an LDR instruction never completes. The program jumps to 0xE0002200, then to "curr_sp0_fiq", and loops forever because the STP instruction from the image below also never completes.

    I made this permalink to the Makefile I used:

    https://github.com/michalmonday/morello_baremetal/blob/5c17d0600a0ae10549e414f70b3bf86fc58a253e/Makefile

    And this permalink to the objdump it produced:

    https://github.com/michalmonday/morello_baremetal/blob/5c17d0600a0ae10549e414f70b3bf86fc58a253e/hello_world.dump

    I tried running it with and without ".ds" script from this reply about semihosting (because curr_sp0_fiq was mentioned in that thread):

    set semihosting heap-base 0
    set semihosting heap-limit 0x80800000
    set semihosting stack-limit 0x80800000
    set semihosting stack-base 0x81000000
    set semihosting enabled on

    But in both cases the behaviour was the same.

  • It looks like an exception occurs here:

    https://git.morello-project.org/morello/newlib/-/blob/morello/master/libgloss/aarch64/crt0.S#L180

    Hard to say why without more information. Have a look at the value of the ESR_EL2 system register just after the exception is taken, feel free to copy it here.

    I just tried to do the same thing I did before (using the same hello_world program), and somehow the LDR instruction does not cause an exception anymore; I have no idea why. I am now facing another issue. When I step through the program using "F5" (step into), multiple function calls execute and return correctly. But when I use "F6" (step over), each of the following functions results in a jump to "curr_sp0_fiq":

    - _cpu_init_hook (in the source code _cpu_init_hook seems to contain only a "ret" instruction, but in the disassembly it calls _init_vectors and _flat_map; I attached an image of it below)

    - _init_vectors

    - _flat_map

    - memset

    I didn't test F6 with any other functions, but every function I tried to step over resulted in an exception.

    I tried to use breakpoints to find the part of the code that causes this exception, but working with breakpoints seems unstable (sometimes running the code until a breakpoint worked well, sometimes it crashed Arm Development Studio, and sometimes it made the program jump to address 0 and required rebooting the board).

    In this thread the same behaviour was described (where stepping through code worked, and running it resulted in exception), and the suggested solution was to introduce 2 ISB instructions, but from what I see in the "rebuild-newlib" I used, these 2 ISB instructions are already in the following file:

    newlib/libgloss/aarch64/crt0.S

    I checked system registers after the program went to "curr_sp0_fiq" (following F6/step-over functions) and the values were always the following:

  • I'm really not sure what is happening regarding stepping over. A few things I can say though:

    • _cpu_init_hook is actually defined here. The definition you were looking at is a fallback (weak symbol), in case it is not otherwise defined.
    • The ESR_EL2 value in your screenshot corresponds to a data abort, with the DFSC (0x2a) indicating a capability bound fault (DS should give you the decoding if you click on the + icon next to it). The address at which the access (write) failed is indicated by FAR_EL2, and I would guess that is somewhere on the stack. Maybe a store via CSP, whose bounds are not appropriate? The address of the instruction at which the fault occurred is indicated by ELR_EL2. That might tell you enough to figure out what happened.
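
    If you want to decode the value by hand instead of relying on DS, here is a minimal sketch in bash. The function name decode_esr and the sample value 0x9600006A are made up for illustration (the sample is chosen only so that the fault status code comes out as 0x2a; it is not the value from your screenshot). The field positions are the standard ESR_ELx layout; of the capability fault names, only 0x2a is confirmed in this thread, so please check the others against the Morello architecture supplement.

```shell
#!/usr/bin/env bash
# Sketch only: decode_esr is a hypothetical helper name.
# Splits an ESR_ELx value into EC (bits 31:26), IL (bit 25),
# WnR (bit 6) and the fault status code (bits 5:0).
decode_esr() {
    local esr=$(( $1 ))
    local ec=$(( (esr >> 26) & 0x3f ))
    local il=$(( (esr >> 25) & 0x1 ))
    local wnr=$(( (esr >> 6) & 0x1 ))   # only meaningful for data aborts
    local fsc=$(( esr & 0x3f ))
    local ec_name fsc_name

    case "$ec" in
        32) ec_name="instruction abort, lower EL" ;;  # 0x20
        33) ec_name="instruction abort, same EL" ;;   # 0x21
        36) ec_name="data abort, lower EL" ;;         # 0x24
        37) ec_name="data abort, same EL" ;;          # 0x25
        *)  ec_name="other" ;;
    esac

    # Morello capability faults. 0x2a is the one named in this thread;
    # the remaining entries are assumptions to verify against the supplement.
    case "$fsc" in
        40) fsc_name="capability tag fault" ;;         # 0x28
        41) fsc_name="capability sealed fault" ;;      # 0x29
        42) fsc_name="capability bound fault" ;;       # 0x2a
        43) fsc_name="capability permission fault" ;;  # 0x2b
        *)  fsc_name="other" ;;
    esac

    printf 'EC=0x%02x (%s) IL=%d WnR=%d FSC=0x%02x (%s)\n' \
        "$ec" "$ec_name" "$il" "$wnr" "$fsc" "$fsc_name"
}

# Example with the made-up value:
# decode_esr 0x9600006A
# prints: EC=0x25 (data abort, same EL) IL=1 WnR=1 FSC=0x2a (capability bound fault)
```

    WnR=1 would mean the faulting access was a write, which is what you would expect for a store to the stack.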
  • It is a tricky issue to debug because pressing F6 to step over a function does not stop the execution; it just makes the code run forever (as in the image with red arrows). After pressing the pause button, the state of the registers is therefore different from their state when "curr_sp0_fiq" was invoked for the first time, which makes it difficult to recognize what caused the issue in the first place.

    After stepping over the _cpu_init_hook function and pressing the pause button, here's what I can see:

    (I highlighted some registers with orange because the font is awkward without it and names can't be seen, changing theme didn't help)

    If I understand correctly the STP instruction in "write" function makes the program jump to "curr_sp0_fiq". And the STP instruction used C29, C30 and CSP capabilities. CSP appears to have the value 0xFF000000, and because the STP instruction specifies "-32" offset I checked the 0xFEFFFFE0 address contents (no idea if this is helpful in any way):

  • My previous reply to this message was hidden so it may appear after this one.

    I just realized that connecting to "Rainier_SMP_0" or "Rainierx4 Multi-Cluster SMP" (instead of "Rainier_0") causes the LDR instruction (the first instruction of the ".pure" function that follows "_start") to jump to 0xE0002000 (and then to "curr_sp0_fiq").

    I don't know why but at least it gives the opportunity to view registers values when the issue first happens.

    This is the whole code that executes before LDR instruction fails:

    After the jump to 0xE0002000, this is the state of registers/memory:

    The ESR_EL2 mentions "Capability tag fault", do you know what could be the cause of it? Is it because C0 tag is not equal to 1 for some reason?

  • Ah that is indeed progress. Your first post suggests some kind of stack overflow, as CSP hits its lower bound. Clearly this is a consequence of something else going wrong, probably related to this strange stepping behaviour.
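
    To make the lower-bound reasoning concrete, here is a minimal sketch in bash. The helper names pre_index_addr and in_bounds are invented for illustration, and the bounds used in the example are hypothetical (the real ones are whatever DS shows for CSP). The CSP value 0xFF000000 and the -32 offset are the ones from your post.

```shell
#!/usr/bin/env bash
# Sketch only: both helper names are hypothetical.

# pre_index_addr BASE OFFSET
# Prints the address accessed by a pre-indexed store such as an STP
# that uses CSP with an offset of -32.
pre_index_addr() {
    printf '0x%X\n' $(( $1 + $2 ))
}

# in_bounds ADDR SIZE LOWER UPPER
# Succeeds (exit status 0) only if [ADDR, ADDR+SIZE) lies inside
# [LOWER, UPPER), which is the check a bounded capability enforces.
in_bounds() {
    local addr=$(( $1 )) size=$(( $2 )) lower=$(( $3 )) upper=$(( $4 ))
    (( addr >= lower && addr + size <= upper ))
}

# Example: CSP = 0xFF000000, storing two 16-byte capabilities at offset -32.
# pre_index_addr 0xFF000000 -32        # prints 0xFEFFFFE0
# Hypothetical stack bounds [0xFF000000, 0xFF010000):
# in_bounds 0xFEFFFFE0 32 0xFF000000 0xFF010000 || echo "out of bounds"
```

    With bounds like those, a CSP that has already reached its lower bound cannot take another 32-byte push, which would match the capability bound fault.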

    The second post is a lot more straightforward: the function pointer C0 is null-derived, so BR C0 will cause PCC to become an invalid capability and thus cause an instruction abort right away. Of course the question is why C0 would be null-derived. Assuming the code sequence is correctly executed, this could only happen if DDC itself is null. Could you check the value of DDC_EL2?

  • Right, that'll be your problem. How that came to pass, I have no idea... The first step would be to check if it is valid at the very beginning of the execution. If not, there must be something going wrong with the firmware.

  • I rebooted the board and it seems that DDC_EL2 is no longer null at the beginning of execution, and the LDR instruction now executes correctly.

    I wrongly assumed that changing "Rainier_0" to "Rainier_SMP_0" or "Rainierx4 Multi-Cluster SMP" made the LDR fail. I think it was a coincidence, because switching between these 3 options no longer makes the LDR fail (following the reboot, which fixed DDC_EL2 being 0).

    But the issue where the function enters "curr_sp0_fiq" recursively after being stepped-over is still there.

    I set hardware breakpoints on "curr_sp0_fiq" and 0xE0002200, used F5 to step until the first function call (which was _cpu_init_hook), and pressed F6 to step over it. The breakpoint was triggered and the register values are:

    I tried expanding the column and copying the text to see if there's something after "following..." but there was nothing, this is what it looks like when using tooltip:

    DDC_EL2 seems to keep the same value as at the beginning of execution. I think there may be 2 separate issues: DDC sometimes being 0 (which gets fixed by rebooting the board), and this unidentified issue when stepping over a function call. ELR_EL2 points to the instruction just after the _cpu_init_hook call.

    I did another experiment, where I pressed "continue" button (from the beginning of "_start" function) instead of using F5 to reach the 1st function call. Interestingly, in this case the breakpoint is hit with ELR_EL2 having much higher value (part of _get_s function), where DDC_EL2 becomes 0.

    Apologies for bombarding with all these screenshots/reports but it's all black magic to me, and I can't understand why such weird issues could occur.