How to build Pure-Capability CheriBSD for the Morello Platform to compile, debug and execute software

Hi all,

I apologise in advance, this might be long-winded as I am fairly new to CheriBSD and Morello. I have an AWS machine running Ubuntu 18 on an Intel x64 system.

My aim is to have a CheriBSD instance running the Arm Morello FVP, and to then compile, debug and run various pieces of software. Obviously, some of the source code will have to be amended to account for the CHERI changes to pointers and accepted data types.

A quick read of the CheriBSD GitHub tells me that ./cheribuild.py is my go-to for building all things CheriBSD. I've cloned the cheribuild GitHub repo so that I can use the script to install what I need.

At the moment, I'm running:

./cheribuild.py run-morello-purecap -d

...but my questions are:

What is that coming with? Do I already have everything I need as part of the development toolchain to compile (Clang/LLVM), debug (gdb)? Does that create a disk image? How do I find everything that is being included when running that command?

How do I find the dependencies that are being included and built when using -d?

When running ./cheribuild.py --list-targets, it displays multiple of what look like closely similar options. How do I know which one I need to specify? What is the difference between llvm-native, llvm-morello-purecap, morello-llvm-morello-purecap and llvm, for example? Are some of these simply aliases and mean the same thing? Are some being added to the host machine, and some being added to the CheriBSD instance?

How do the disk-images work with ./cheribuild disk-image-<architecture>? Can I boot from this to save the machine state? How can I save the state of the machine so that I can reboot, and pick back up from where I left off? This would be nice to know so I don't have to build every time I exit or have to use a command like screen, or is this simply not possible.

What is the difference between ./cheribuild.py run-morello-purecap -d  and ./cheribuild.py cheribsd-morello-purecap -d?

If I build using ./cheribuild run-morello-purecap -d and then realise I would like to build another target such as Nginx for example and add it to the instance already built, do I simply run: ./cheribuild nginx-morello-purecap, and it will "amend" the CheriBSD instance, or is there another method I should be using instead?

Does all compilation, debugging etc. take place on the host machine, rather than CheriBSD itself?

--------------------------

I appreciate there are a lot of questions here and a lot to go through, so I appreciate anyone offering help to somebody new to the area! 

If there is any documentation, guides or tutorials that answers my questions, please post the link as I am trying to be a self-learner as much as possible :)

Thanks!

  • Hi,

    Great to hear that you have been trying cheribuild. I've tried to answer all your questions below:

    At the moment, I'm running:

    ./cheribuild.py run-morello-purecap -d

    ...but my questions are:

    What is that coming with? Do I already have everything I need as part of the development toolchain to compile (Clang/LLVM), debug (gdb)? Does that create a disk image? How do I find everything that is being included when running that command?

    The -d flag of cheribuild means "build all targets that are dependencies first", so run-fvp-morello-purecap will build the toolchain and disk image first. By default everything will be placed under ~/cheri, but you can change that using command line flags or the ~/.config/cheribuild.json config file.

    One more thing I would like to point out is that `run-morello-purecap` will run CheriBSD on our experimental QEMU Morello. While it appears to mostly match the Arm FVP, there could be some subtle behaviour differences.

    How do I find the dependencies that are being included and built when using -d?

    cheribuild has a --pretend (-p) flag that will show you what commands would be executed. There is also a --print-targets-only flag that is less verbose. For example, cheribuild.py run-fvp-morello-purecap -d --print-targets-only will tell you that it will execute the following targets:

    install-morello-fvp
    morello-llvm-native
    cheribsd-morello-purecap
    gdb-morello-hybrid-for-purecap-rootfs
    morello-firmware
    disk-image-morello-purecap
    run-fvp-morello-purecap

    Do note however that cheribuild will not build everything that is needed e.g. to compile LLVM. For some dependencies we rely on the host system packages. We have listed those packages under https://github.com/CTSRD-CHERI/cheribuild#pre-build-setup If you are missing some of those packages the build should fail with a (hopefully readable) error message.
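
    As a rough sketch of the preview step described above, the snippet below lists the targets that `-d` would pull in without building anything. The checkout location in `CHERIBUILD` is an assumption, so point it at wherever you cloned cheribuild.

```shell
# Assumed checkout location (not from the thread); override via the environment.
CHERIBUILD="${CHERIBUILD:-$HOME/cheribuild/cheribuild.py}"
TARGET="run-fvp-morello-purecap"

if [ -x "$CHERIBUILD" ]; then
    # Print the dependency targets only; nothing is built or booted.
    "$CHERIBUILD" "$TARGET" -d --print-targets-only
else
    echo "cheribuild.py not found at $CHERIBUILD" >&2
fi
```

    Swapping `--print-targets-only` for `--pretend` should show the individual commands instead, which is more verbose.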

    When running ./cheribuild.py --list-targets, it displays multiple of what look like closely similar options. How do I know which one I need to specify? What is the difference between llvm-native, llvm-morello-purecap, morello-llvm-morello-purecap and llvm, for example? Are some of these simply aliases and mean the same thing? Are some being added to the host machine, and some being added to the CheriBSD instance?

    Generally, the `-d` flag will take care of building all dependencies, so you should only need to specify the target you are interested in. Usually that will be the project you are trying to port, followed by `disk-image-morello-purecap` to include it in the image and then `run-fvp-morello-purecap` to boot it.

    Regarding the different LLVM targets: llvm-native builds CHERI LLVM (RISC-V and MIPS) for your machine, llvm-morello-purecap builds CHERI LLVM as a purecap Morello binary for CheriBSD (note however that running this compiler inside the emulator will be extremely slow, so you should be cross-compiling instead). `morello-llvm-native` will build Arm's fork of CHERI LLVM that adds Morello support and `morello-llvm-morello-purecap` cross-compiles that for CheriBSD (again not recommended). In summary: all `morello-*-native` targets build for the host machine; if they are named `*-morello-purecap` they are cross-compiled for CheriBSD.

    How do the disk-images work with ./cheribuild disk-image-<architecture>? Can I boot from this to save the machine state? How can I save the state of the machine so that I can reboot, and pick back up from where I left off? This would be nice to know so I don't have to build every time I exit or have to use a command like screen, or is this simply not possible.

    The disk image target creates a QEMU image from the CheriBSD install directory. By default all changes you make while running CheriBSD are persistent. Do note however that running `cheribuild.py disk-image-<suffix>` will overwrite the disk image. If you want cheribuild to warn before overwriting the disk image, you can add "--disk-image/no-force-overwrite" to the cheribuild command line or add the following to cheribuild.json:

    { "disk-image": { "force-overwrite": false } }

    What is the difference between ./cheribuild.py run-morello-purecap -d  and ./cheribuild.py cheribsd-morello-purecap -d?

    The former will boot the disk-image after building all dependencies, the latter will only compile CheriBSD (and the dependencies, i.e. the toolchain). The latter also doesn't build a disk image. If you use `--print-targets-only` you can see that cheribsd-morello-purecap will be built as part of `run-morello-purecap -d`.

    If you don't want to rebuild everything and just boot CheriBSD, you can omit the `-d` flag.

    If I build using ./cheribuild run-morello-purecap -d and then realise I would like to build another target such as Nginx for example and add it to the instance already built, do I simply run: ./cheribuild nginx-morello-purecap, and it will "amend" the CheriBSD instance, or is there another method I should be using instead?

    The nginx target will install all files into the cheribsd install directory, but it will not update the disk image. In order to run the just-built nginx, you can run `cheribuild.py disk-image-morello-purecap run-fvp-morello-purecap` to boot a new disk image that will then include nginx. To get nginx into your existing instance without rebuilding the disk image you can either scp the files over or use a network mount.
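
    The rebuild-and-boot sequence above can be sketched as a single invocation. This is only an illustration: the `CHERIBUILD` path is an assumption, and running it will overwrite the existing disk image, as noted earlier in this thread.

```shell
# Assumed checkout location (not from the thread); override via the environment.
CHERIBUILD="${CHERIBUILD:-$HOME/cheribuild/cheribuild.py}"

# Project first, then the image that includes it, then boot the new image.
TARGETS="nginx-morello-purecap disk-image-morello-purecap run-fvp-morello-purecap"

if [ -x "$CHERIBUILD" ]; then
    # Word splitting of $TARGETS is intentional: each word is one target.
    "$CHERIBUILD" $TARGETS
else
    echo "Would run: cheribuild.py $TARGETS"
fi
```

    Shut CheriBSD down cleanly before running this if the old image holds work you have not copied elsewhere.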

    Does all compilation, debugging etc. take place on the host machine, rather than CheriBSD itself?

    While we can cross-compile LLVM as a CheriBSD AArch64 binary (purecap is still experimental) I would strongly recommend against it. Cross-compilation is the preferred approach since it is orders of magnitude faster and the CheriBSD disk images will rarely be large enough to compile anything non-trivial.

    For debugging, I generally run GDB inside CheriBSD. For large binaries, loading the debug information on startup can take a while but otherwise it is perfectly usable. To avoid having to wait for manual input when debugging a crashing program, I've added a `gdb-run.sh <command> <args>` script to the CheriBSD disk images that will start the program and run it once the symbol information has been loaded and will then print a stacktrace once the program halts.

    If there is any documentation, guides or tutorials that answers my questions, please post the link as I am trying to be a self-learner as much as possible :)

    The documentation for cheribuild is mostly just the README file. If you have any suggestions or improvements I'd be very happy to merge any patches :) We also recently started gathering useful information on a GitHub Wiki (https://github.com/CTSRD-CHERI/cheripedia/wiki/FAQ:-Standing-Up-Your-Copy).

    Hope I've answered all your questions. If you run into any problems related to cheribuild (or missing features), please open an issue at https://github.com/CTSRD-CHERI/cheribuild/issues.

  • Wow! Thank you so much for this detailed feedback on all the questions, it is greatly appreciated :) I now feel more confident getting up and running with CheriBSD. 

    However, as with all things like this, it has raised a few more questions:

    ~/.config/cheribuild.json config

    Any ideas why I don't have this? Do I have to place it there myself?

    I get this message when using ./cheribuild.py:

    Warning: Configuration file /home/ubuntu/.config/cheribuild.json does not exist, using only command line arguments.

    In summary: all `morello-*-native` targets build for the host machine, if they are named `*-morello-purecap` they are cross-compiled for CheriBSD.

    When running --list-targets, there are also some entries that don't feature an architecture or platform, such as:

    ./cheribuild.py --list-targets

    There are 1386 available targets:

    .......

    qemu

    go

    ninja

    .......

    How do I know which platform these are to be installed on? Are they purecap, hybrid, native? Are they for MIPS, RISC-V, Morello? Are they for the host machine or CheriBSD?

    In order to run the just-built nginx, you can run `cheribuild.py disk-image-morello-purecap run-fvp-morello-purecap` to boot a new disk image that will then include nginx

    Taking this into account, what is the recommended way to close down the CheriBSD session (I am using PuTTY) so that I can return to the host machine to build Nginx with cheribuild before overwriting the disk image with ./cheribuild.py nginx-morello-purecap disk-image-morello-purecap? And then after this, what is the best way to quickly get CheriBSD booted again so I can continue my work? If I shut down CheriBSD without overwriting the disk, will all my work be lost?

    Also, seeing as running ./cheribuild.py run-fvp-morello-purecap executes the disk-image-morello-purecap target, if I want to boot CheriBSD without overwriting the disk, should I run ./cheribuild.py cheribsd-morello-purecap?

    Cross-compilation is the preferred approach since it is orders of magnitude faster and the CheriBSD disk images will rarely be large enough to compile anything non-trivial.

    When cross-compiling from the host machine to CheriBSD, is the recommended workflow to have two sessions running, one for the host machine, and one for CheriBSD? Does the CheriBSD instance have to be up and running when cross-compilation occurs?

    Thanks again for all your help with this, I really appreciate it :)

  • Any ideas why I don't have this? Do I have to place it there myself?

    I get this message when using ./cheribuild.py:

    Warning: Configuration file /home/ubuntu/.config/cheribuild.json does not exist, using only command line arguments.

    The config file will not be created automatically, so if you want to override some defaults persistently you need to create it manually.
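
    As a minimal sketch of creating the file by hand, the snippet below writes the `force-overwrite` setting quoted earlier in this thread to the path named in the message. The `CONFIG_FILE` variable is only a convenience introduced here, and an existing file is left untouched.

```shell
# Path taken from the message above; the variable name is just for this sketch.
CONFIG_FILE="${CONFIG_FILE:-$HOME/.config/cheribuild.json}"

mkdir -p "$(dirname "$CONFIG_FILE")"
if [ -e "$CONFIG_FILE" ]; then
    echo "$CONFIG_FILE already exists, leaving it untouched"
else
    # Ask cheribuild to warn before overwriting the disk image.
    cat > "$CONFIG_FILE" <<'EOF'
{
  "disk-image": { "force-overwrite": false }
}
EOF
    echo "Created $CONFIG_FILE"
fi
```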

    This message should not be a warning, just an informational message. I didn't notice that it is a warning since I have a config file set up everywhere I use cheribuild. I just pushed a fix and this should no longer say "warning".

    .......

    qemu

    go

    ninja

    .......

    How do I know which platform these are to be installed on? Are they purecap, hybrid, native? Are they for MIPS, RISC-V, Morello? Are they for the host machine or CheriBSD?

    If there is no target suffix, this (almost) always means that cross-compilation is not supported yet so it will be a host binary.
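
    The naming rules from this thread can be summarised as a small helper. This is a heuristic sketch only: `classify_target` is a hypothetical function, not part of cheribuild, and it knows nothing beyond the suffix conventions described here.

```shell
# Hypothetical helper: guess where a cheribuild target ends up from its name alone.
classify_target() {
    case "$1" in
        *-native)          echo "host" ;;                     # built for the host machine
        *-morello-purecap) echo "cheribsd-morello-purecap" ;; # built for CheriBSD
        *-*)               echo "unknown" ;;                  # some other suffix or a hyphenated name
        *)                 echo "host" ;;                     # no suffix: (almost) always a host binary
    esac
}

for t in morello-llvm-native nginx-morello-purecap qemu; do
    echo "$t: $(classify_target "$t")"
done
```

    Anything reported as `unknown` needs a look at the target's description rather than a guess from the name.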

    Taking this into account, what is the recommended way to close down the CheriBSD session (I am using PuTTY) so that I can return to the host machine to build Nginx with cheribuild before overwriting the disk image with ./cheribuild.py nginx-morello-purecap disk-image-morello-purecap? And then after this, what is the best way to quickly get CheriBSD booted again so I can continue my work? If I shut down CheriBSD without overwriting the disk, will all my work be lost?

    Also, seeing as running ./cheribuild.py run-fvp-morello-purecap executes the disk-image-morello-purecap target, if I want to boot CheriBSD without overwriting the disk, should I run ./cheribuild.py cheribsd-morello-purecap?

    CheriBSD will write back all changes to the disk image regularly. However, if you kill the emulator, you might lose some in-flight changes or corrupt the image. I would recommend using the `poweroff` command to shut down cleanly. But if you are rebuilding the disk image anyway to include new files you can just kill the emulator (CTRL+A followed by X for QEMU and CTRL+] followed by q<ENTER> for the FVP).

    If you want to boot without overwriting the disk image you can run `cheribuild.py run-fvp-morello-purecap` without "-d" - in that case dependencies such as disk-image-morello-purecap are skipped.

    When cross-compiling from the host machine to CheriBSD, is the recommended workflow to have two sessions running, one for the host machine, and one for CheriBSD? Does the CheriBSD instance have to be up and running when cross-compilation occurs?

    Yes, I would absolutely recommend using multiple sessions. There is no need to have CheriBSD running when you cross-compile since this does not affect the running CheriBSD instance (unless you have the installation directory mounted in the CheriBSD VM - for QEMU this can be done by running the `qemu-mount-rootfs.sh` script).

  • I have a hello-world.c program I'd like to cross-compile across to the CheriBSD OS. What does this process look like? Is there a guide I can follow?

    If you want to compile a simple C/C++ file, cheribuild installs some helper scripts in output/morello-sdk/utils/cheribsd-morello-purecap-clang(++) that will add the necessary compiler flags to select the right ABI and CheriBSD sysroot (the flags are defined in output/morello-sdk/bin/cheribsd-morello-purecap.cfg). If you want to compile something more complicated, I would recommend adding a new target to cheribuild. I've added a short wiki page that shows how to add new targets, I hope it is helpful: https://github.com/CTSRD-CHERI/cheribuild/wiki/Adding-a-new-cheribuild-target.
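
    For the simple single-file case, a sketch along these lines should work. It assumes the default ~/cheri root, so the wrapper would live at ~/cheri/output/morello-sdk/utils/cheribsd-morello-purecap-clang; the `SDK_CC` and `WORKDIR` names are only for this example.

```shell
# Assumed wrapper location: default ~/cheri root plus the path quoted above.
SDK_CC="${SDK_CC:-$HOME/cheri/output/morello-sdk/utils/cheribsd-morello-purecap-clang}"
WORKDIR="$(mktemp -d)"

# Write a minimal test program.
cat > "$WORKDIR/hello-world.c" <<'EOF'
#include <stdio.h>

int main(void)
{
    printf("Hello from CheriBSD\n");
    return 0;
}
EOF

if [ -x "$SDK_CC" ]; then
    # The wrapper adds the ABI and sysroot flags, so none are passed here.
    "$SDK_CC" -o "$WORKDIR/hello-world" "$WORKDIR/hello-world.c"
    echo "Built $WORKDIR/hello-world"
else
    echo "Wrapper not found at $SDK_CC; build the SDK with cheribuild first" >&2
fi
```

    The resulting binary only runs inside CheriBSD, so it still has to be copied over (for example with scp) or made visible through a mount.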

    We also have a few CHERI exercises here: https://ctsrd-cheri.github.io/cheri-exercises/introduction/index.html . The guide mostly talks about CHERI-RISC-V but applies equally to Morello.

    Also, how does the cheribsd-sdk-<architecture> come into play during this whole process?

    This target is effectively an alias to build a compiler and a sysroot. If you already booted CheriBSD, you will have both of these so you won't need to run cheribsd-sdk-<foo>.