How to make changes to the disk/root filesystem of a CheriBSD instance

Hi all,

I am currently running a CheriBSD instance on the Morello FVP, built with the cheribuild script.

I added a test file (helloworld.txt) to the root filesystem of the machine under ~/cheri/output/rootfs-morello-purecap/root

I then rebuilt the machine and disk image with ./cheribuild.py run-fvp-morello-purecap -d

I assumed this would take into account the changes I've made, and would rebuild the machine/disk image to include them. However, when going to /root on the CheriBSD instance after rebuilding, the file isn't there. It has also been removed from the host system under ~/cheri/output/rootfs-morello-purecap/root

Why is this happening? And what is the best way to make amendments to a CheriBSD instance, such as adding files, so that when the machine is rebuilt/booted with the cheribuild script they are included?

Thanks!

  • Brilliant! I've got it working now, thank you :)

    Great :)

    So besides extra-files and output, what is the rundown of the rest of the ~/cheri directory and each section's purpose? 

    Most directories are just source directories that are clones of various git repos (CheriBSD, QEMU, our LLVM, Arm's fork of our LLVM). The exceptions are:

    • build - as you might expect, where the various out-of-tree build directories exist
    • extra-files - as discussed above
    • extra-files-minimal - same as extra-files, but for the minimal disk images cheribuild can optionally produce (primarily useful for running on FPGAs where the entire filesystem needs to fit in memory)
    • output - where various native tools live
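As a rough sketch of how extra-files can be used for the helloworld.txt case from the original question: the helper below copies a host file into the extra-files tree at the path it should have in the guest. The function name `stage_file`, the `EXTRA_FILES` variable, and the assumption that the tree under extra-files mirrors the guest's root filesystem are illustrative, not taken from cheribuild itself.

```shell
#!/bin/sh
# Assumed default location of the extra-files directory; override if yours differs.
EXTRA_FILES="${EXTRA_FILES:-$HOME/cheri/extra-files}"

# stage_file SRC GUEST_PATH  (hypothetical helper)
# Copies SRC to $EXTRA_FILES/GUEST_PATH, creating parent directories first.
# Assumes the layout under extra-files mirrors the guest root filesystem.
stage_file() {
    src="$1"
    guest_path="${2#/}"                 # strip one leading slash
    dest="$EXTRA_FILES/$guest_path"
    mkdir -p "$(dirname "$dest")" && cp "$src" "$dest"
}

# Usage (then rebuild and boot with the command from the original post):
#   stage_file helloworld.txt /root/helloworld.txt
#   ./cheribuild.py run-fvp-morello-purecap -d
```

Staging files here rather than under output/rootfs-morello-purecap avoids losing them when the rootfs is regenerated.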
    Inside the output/morello-sdk/bin/ directory, is that where native binaries are installed, for example if I ran the command ./cheribuild.py python-native?

    The morello-sdk directory is specifically for things that conflict with the CHERI versions of tools (e.g. sdk/bin/clang is our CHERI-LLVM but morello-sdk/bin/clang is Arm's Morello-extended CHERI-LLVM), as well as the odd Morello-specific thing (that's where cheribuild installs the FVP and firmware). These are all things that are intended to be run natively (but, in the case of the compiler, produce binaries that run on a CHERI or Morello system). Since python-native falls into neither of those special cases, it will be installed to sdk not morello-sdk.

    Also, the binaries already in ~/cheri/output/morello-sdk/bin/ such as clang & lld, are they no different than if I installed LLVM normally on my native machine, for example? Or are they "capability-enabled"? Also, are these the binaries I would use to cross-compile from my host machine to CheriBSD? It's confusing because I can install LLVM/Clang for my host machine, LLVM/Clang for my host machine but it understands pure capability code, and I can install LLVM/Clang for the CheriBSD instance (unless I've got that wrong :D)

    The binaries there are built from Arm's Morello-extended CHERI-LLVM and run natively on your current machine. They are intended to be used for cross-compiling, but if you really want to, they will also work for compiling native x86 binaries, which I believe is what you mean by your middle option. Cross-compiling LLVM to run from within CheriBSD is a thing you can do, but it's not recommended because of how excruciatingly slow it will be. Also, LLVM cannot currently be compiled as a pure-capability binary (though I have done some work in the past on a set of patches that make it mostly work, other than one known outstanding issue), so it would have to be built as a legacy plain AArch64 binary. That binary would still run on a pure-capability CheriBSD instance, since we provide a compatibility syscall interface and set of libraries, much like running 32-bit applications on 64-bit OSes today.

  • These are all things that are intended to be run natively (but, in the case of the compiler, produce binaries that run on a CHERI or Morello system). Since python-native falls into neither of those special cases, it will be installed to sdk not morello-sdk.

    What's the point in the "...-native" binaries such as ./cheribuild.py python-native or ones like ./cheribuild go or ./cheribuild ninja? Would I not just be better off using apt-get install python, apt-get install go or apt-get install ninja on my Ubuntu machine? Or have I misunderstood what is meant by "native"? Also, I ran ./cheribuild.py python-native and the ~/cheri/output/sdk directory was not created. Is this a bug?

    The binaries there are built from Arm's Morello-extended CHERI-LLVM and run natively on your current machine, intended to be used for cross-compiling

    So, to cross-compile code such as a simple helloworld.c program to run on a CheriBSD instance running the Morello FVP, I use the tools in ~/cheri/output/morello-sdk such as clang? I'm guessing in the command I run to cross-compile I reference the file to compile, the root filesystem and the target architecture? 

    What about for debugging? Is it recommended you cross-debug? Or run gdb that is already inside CheriBSD at /usr/local/bin?

    What is the point in running ./cheribuild.py pkg-morello-purecap when only certain binaries can be installed because of the pure capability architecture? Or is this simply for managing already installed packages?

    I also got 4 errors when installing the target pkg-morello-purecap (https://pastebin.com/EscixH4b). The errors are near the bottom of the console output. I'm not sure if it has actually installed properly.

    Same with python-morello-purecap: https://pastebin.com/p2MycyFa - A lot of warnings and errors near the bottom.

    Thanks again for all your help :)

  • What's the point in the "...-native" binaries such as ./cheribuild.py python-native or ones like ./cheribuild go or ./cheribuild ninja? Would I not just be better off using apt-get install python, apt-get install go or apt-get install ninja on my Ubuntu machine? Or have I misunderstood what is meant by "native"? Also, I ran ./cheribuild.py python-native and the ~/cheri/output/sdk directory was not created. Is this a bug?

    The -native targets exist for three reasons:

    • Some targets such as CHERI LLVM/QEMU need to be built from source and don't have distribution packages.
    • In some cases the distribution-provided packages might be too old. It is also useful if you don't have root access.
    • Being able to build and test the native version is helpful to ensure that any changes you made for CHERI support have not changed/broken the non-CHERI version

    Regarding python: The -native version here is required as a build tool for the CHERI version. It is not installed to the SDK, only in a subdirectory of the build dir. I did this to ensure that this version of python is not accidentally used by any build scripts, etc. since we don't enable all features and that might break existing scripts.

    So, to cross-compile code such as a simple helloworld.c program to run on a CheriBSD instance running the Morello FVP, I use the tools in ~/cheri/output/morello-sdk such as clang? I'm guessing in the command I run to cross-compile I reference the file to compile, the root filesystem and the target architecture? 

    As mentioned in the other thread: If you want to compile a simple C/C++ file, cheribuild installs some helper scripts in output/morello-sdk/utils/cheribsd-morello-purecap-clang(++) that will add the necessary compiler flags to select the right ABI and CheriBSD sysroot (the flags are defined in output/morello-sdk/bin/cheribsd-morello-purecap.cfg).

    In my case these flags are: 

    -target aarch64-unknown-freebsd13 --sysroot=.../cheri/output/rootfs-morello-purecap -B..../cheri/output/morello-sdk/bin -march=morello+c64 -mabi=purecap -femulated-tls

    You can then place the output in the root file system and rebuild the disk image.
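A minimal sketch of using the helper script for a single C file might look like the following. The wrapper path comes from the description above. The `purecap_compile` function, the `CHERI_OUTPUT` and `DRY_RUN` variables, and the choice of `-Wall` are assumptions for illustration. By default it only prints the command it would run.

```shell
#!/bin/sh
# Assumed default cheribuild output directory; override CHERI_OUTPUT if yours differs.
CHERI_OUTPUT="${CHERI_OUTPUT:-$HOME/cheri/output}"
# Wrapper script that adds the target, sysroot and ABI flags for us.
PURECAP_CC="$CHERI_OUTPUT/morello-sdk/utils/cheribsd-morello-purecap-clang"

# purecap_compile SRC [OUT]  (hypothetical helper)
# Prints the compile command when DRY_RUN=1 (the default), runs it when DRY_RUN=0.
purecap_compile() {
    src="$1"
    out="${2:-${src%.c}}"               # default output name: SRC without ".c"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $PURECAP_CC -Wall -o $out $src"
    else
        "$PURECAP_CC" -Wall -o "$out" "$src"
    fi
}

# Usage:
#   purecap_compile helloworld.c            # show the command
#   DRY_RUN=0 purecap_compile helloworld.c  # actually cross-compile
```

The resulting binary only runs on the CheriBSD guest, so it still has to be copied into the image before booting.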

    What about for debugging? Is it recommended you cross-debug? Or run gdb that is already inside CheriBSD at /usr/local/bin?

    I generally use gdb inside CheriBSD, but you can also extract the coredumps to the host and use a native gdb on them. I would recommend using GDB inside the guest since it will automatically pick up shared libraries, etc.

    What is the point in running ./cheribuild.py pkg-morello-purecap when only certain binaries can be installed because of the pure capability architecture? Or is this simply for managing already installed packages?

    The pkg target should not be needed. It was intended as an experiment to allow pre-installing other packages into the root file system (using pkg-native to build it, e.g. on a Linux host). I never tried building anything except pkg-native, and I am not sure it will work as a pure-capability program.

    Same with python-morello-purecap: https://pastebin.com/p2MycyFa - A lot of warnings and errors near the bottom.

    I last tried compiling purecap python in 2019 and to be honest I stopped working on it once I was able to run a simple hello world program.

    The error you are seeing there looks like it comes from newer versions of clang having stricter -Wformat checks. I assume that problem has been fixed in upstream Python and just needs to be cherry-picked back into our fork.

  • As mentioned in the other thread: If you want to compile a simple C/C++ file, cheribuild installs some helper scripts in output/morello-sdk/utils/cheribsd-morello-purecap-clang(++) that will add the necessary compiler flags to select the right ABI and CheriBSD sysroot (the flags are defined in output/morello-sdk/bin/cheribsd-morello-purecap.cfg).

    Thanks, I've got a simple C program running on Cheri now :)

    I'm now looking at compiling the target nginx-morello-purecap. Once I have cross-compiled this, where would it live on the CheriBSD instance? Would I configure this like a normal Nginx web server? How would you suggest configuring the IP so that I can communicate between the Ubuntu host system and the CheriBSD VM running inside QEMU to view the Nginx front-end?

    Thanks again!

  • I'm now looking at compiling the target nginx-morello-purecap. Once I have cross-compiled this, where would it live on the CheriBSD instance?

    If you run cheribuild --pretend nginx-morello-purecap (or do the actual build and look at the messages) you can see how exactly nginx is built and installed. In particular, the configure invocation passes --prefix=/opt/morello-purecap/nginx, so that's where it will live.

    Would I configure this like a normal Nginx web server?

    Yes. Just note that we build it with the default settings, so configuration lives in /opt/morello-purecap/nginx/conf rather than overriding it to something like /etc/nginx like some distributions do.

    How would you suggest configuring the IP so that I can communicate between the Ubuntu host system and the CheriBSD VM running inside QEMU to view the Nginx front-end?

    Easiest way is to port-forward with --run-morello-purecap/extra-tcp-forwarding=$HOST_PORT:80, then it should be available on localhost:$HOST_PORT on the host.
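To make the shape of that option concrete, here is a small sketch that assembles the flag from its parts. The option name and the guest port 80 come from the line above. `RUN_TARGET`, `GUEST_PORT`, the default host port 8080, and the `forward_flag` helper are placeholders chosen for illustration.

```shell
#!/bin/sh
# Placeholders: pick the run target you actually launch and any free host port.
RUN_TARGET="${RUN_TARGET:-run-morello-purecap}"
HOST_PORT="${HOST_PORT:-8080}"
GUEST_PORT="${GUEST_PORT:-80}"

# forward_flag  (hypothetical helper)
# Prints the per-target option that forwards HOST_PORT on the host to GUEST_PORT in the guest.
forward_flag() {
    printf '%s%s/extra-tcp-forwarding=%s:%s\n' '--' "$RUN_TARGET" "$HOST_PORT" "$GUEST_PORT"
}

# Usage:
#   ./cheribuild.py "$RUN_TARGET" "$(forward_flag)"
# Once nginx is running in the guest, check from the host with:
#   curl "http://localhost:$HOST_PORT/"
```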

  • If you run cheribuild --pretend nginx-morello-purecap (or do the actual build and look at the messages) you can see how exactly nginx is built and installed.

    Just done this with the --logfile option so I can have a look at the build steps in detail. In which directory is this logfile produced?

    Easiest way is to port-forward with --run-morello-purecap/extra-tcp-forwarding=$HOST_PORT:80, then it should be available on localhost:$HOST_PORT on the host.

    Brilliant, I'll give this a shot! Just to confirm, would it be --run-morello-purecap/extra-tcp... or --run-fvp-morello-purecap/extra-tcp...?

    Before then, I am currently looking at getting a very simple database configured with postgres-morello-purecap. I've made a new user (dbadmin) for this, as postgres can be a bit funny when using root. I've navigated to its directory at /opt/morello-purecap/postgres/bin and I can't seem to start the postgres service with ./postgres, as I get the error message below. Any ideas? There is also no entry for the postgres service in /etc/rc.d

    dbadmin@cheribsd-morello-purecap:/opt/morello-purecap/postgres/bin $ ./postgres
    postgres does not know where to find the server configuration file.
    You must specify the --config-file or -D invocation option or set the PGDATA environment variable.

    I also can't create a database with ./createdb as I am met with this error message:

    dbadmin@cheribsd-morello-purecap:/opt/morello-purecap/postgres/bin $ ./createdb test
    createdb: could not connect to database template1: could not connect to server: No such file or directory
    Is the server running locally and accepting connections on Unix domain socket "/tmp/.s.PGSQL.5432"?

    Thanks for all your help!

  • Just done this with the --logfile option so I can have a look at the build steps in detail. In which directory is this logfile produced?

    ~/cheri/build/nginx-morello-purecap-build/ will have .log files, but those are solely for the output from running configure and make. The output from cheribuild itself isn't saved anywhere.
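Since cheribuild's own output is not saved, one generic way to keep a copy is to pipe it through `tee`. The wrapper below is a plain shell sketch, not a cheribuild feature. The name `run_logged` and the log file path in the usage line are made up for illustration.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]  (hypothetical helper)
# Runs COMMAND, showing stdout and stderr on the terminal while also
# writing both to LOGFILE. The return status is that of tee, not of
# COMMAND, so check the log for failures.
run_logged() {
    logfile="$1"
    shift
    "$@" 2>&1 | tee "$logfile"
}

# Usage:
#   run_logged "$HOME/nginx-pretend.log" ./cheribuild.py --pretend nginx-morello-purecap
```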

    Brilliant, I'll give this a shot! Just to confirm, would it be --run-morello-purecap/extra-tcp... or --run-fvp-morello-purecap/extra-tcp...?

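    It depends whether you're using our Morello QEMU or Arm's Morello FVP. The prefix matches the name of the run target you launch: run-morello-purecap for QEMU, run-fvp-morello-purecap for the FVP.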

    I've navigated to its directory at /opt/morello-purecap/postgres/bin and I can't seem to start the postgres service with ./postgres, as I get the error message below. Any ideas?

    As it says, you need to tell it where your data directory is.

    There is also no entry for the postgres service in /etc/rc.d

    Upstream PostgreSQL doesn't provide an rc.d file. If you want one, you'll have to copy it from the FreeBSD Ports tree at https://github.com/freebsd/freebsd-ports/blob/main/databases/postgresql14-server/files/postgresql.in or write your own. Note that this includes passing -D to address the above error.

    I also can't create a database with ./createdb as I am met with this error message:

    createdb and initdb are different things. The former is for adding new databases to a running PostgreSQL instance. The latter is for setting up the first database that a server can then be started with. See https://www.postgresql.org/docs/current/creating-cluster.html and https://www.postgresql.org/docs/current/manage-ag-createdb.html. You'll need to run initdb pointed at where you want the data directory to be.
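Put together, the order of operations could be sketched as below: initialise a data directory, start the server against it, then create the database. The bin directory and the database name test come from the posts above. The data directory location, the `run` and `pg_first_start` helpers, the `DRY_RUN` switch, and the use of `pg_ctl` to start the server in the background are assumptions for illustration. By default the script only prints the commands.

```shell
#!/bin/sh
PG_BIN="${PG_BIN:-/opt/morello-purecap/postgres/bin}"
# Hypothetical data directory; any location writable by the dbadmin user works.
PGDATA="${PGDATA:-$HOME/pgdata}"

# run COMMAND [ARGS...]  (hypothetical helper)
# Prints the command when DRY_RUN=1 (the default), executes it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# pg_first_start  (hypothetical helper)
pg_first_start() {
    # 1. Create the initial database cluster in the data directory.
    run "$PG_BIN/initdb" -D "$PGDATA" &&
    # 2. Start the server against that directory and wait until it is up.
    run "$PG_BIN/pg_ctl" -w -D "$PGDATA" -l "$PGDATA/server.log" start &&
    # 3. Only now can createdb connect and add a database.
    run "$PG_BIN/createdb" test
}

# Usage, as the dbadmin user inside the guest:
#   pg_first_start              # show the commands
#   DRY_RUN=0 pg_first_start    # run them
```

Running `./postgres -D "$PGDATA"` in the foreground instead of step 2 should also work. The `pg_ctl` form is used here only so that the third step can follow in the same shell.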