Bare metal software is used for benchmarking, developing software algorithms, comparing different compilers, and developing startup code. Arm DS-5 comes with many examples of bare metal software, especially for CPU initialization. Arm Cycle Models also utilize bare metal software to evaluate CPU, interconnect, and memory options when creating new hardware designs. Some embedded products are developed exclusively using bare metal software or use bare metal software for hardware diagnostics. The Raspberry Pi 3 is a good board for several of these tasks. It has the Arm Cortex-A53, which is widely used in embedded systems, and a flexible boot procedure that allows bare metal software to be run. Working with the board is also an effective way to get familiar with DS-5 and DSTREAM-ST in preparation for future projects.
This blog explains how to use Arm DS-5 and DSTREAM-ST to run and debug AArch64 bare metal software, compiled with Arm Compiler 6, on the Raspberry Pi 3.
A good starting point is to prepare an SD card with the latest version of Raspbian. Follow the instructions to create the SD card using whatever method is easiest. I use Linux and dd to copy the .img file to the SD card, but there are many other options. Even though we are not interested in running Raspbian, some of the files on the SD card are used to set up the board before the bare metal software is executed.
There are various bare metal resources for Raspberry Pi. One reference I used is a presentation from FOSDEM 2017. I’m sure you can find others by searching.
Running bare metal software is as simple as editing a configuration file and copying the bare metal software image to the SD card.
To change the configuration to be bare metal, copy the file config.txt to config.txt.org on the first partition of the SD card. The first partition is a FAT partition which should automatically mount on Windows. If it doesn’t automount on Linux, mount it using something like:
$ sudo mount /dev/mmcblk0p1 /mnt
Edit the new config.txt to have just the lines below:
The first value boots the cores in AArch64 and replaces the deprecated arm_control=0x200. Setting kernel_old specifies that the image is loaded at, and started from, address 0. The line that enables JTAG is required to use DSTREAM-ST; without it, JTAG will not work. The final value skips loading the kernel command line, since it’s not needed for bare metal. More information about the config.txt file can be found in the documentation.
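The four settings described above can also be applied from the command line. The following is a minimal sketch, not the exact file used for this article: the option names (arm_64bit, kernel_old, enable_jtag_gpio, disable_commandline_tags) are assumptions drawn from the description and the Raspberry Pi config.txt documentation, so check them against that documentation for your firmware version. The function name write_baremetal_config is hypothetical.

```shell
# Hypothetical helper: back up config.txt and write a bare metal configuration.
# $1 is the mount point of the first (FAT) partition, for example /mnt.
write_baremetal_config() {
    boot_dir=$1

    # Keep the original Raspbian configuration, but never clobber an earlier backup.
    if [ -f "$boot_dir/config.txt" ] && [ ! -f "$boot_dir/config.txt.org" ]; then
        cp "$boot_dir/config.txt" "$boot_dir/config.txt.org" || return 1
    fi

    # Option names are assumptions based on the config.txt documentation:
    # AArch64 boot, load at address 0, JTAG on the GPIO header, no command line.
    cat > "$boot_dir/config.txt" <<'EOF'
arm_64bit=1
kernel_old=1
enable_jtag_gpio=1
disable_commandline_tags=1
EOF
}
```

With the partition mounted on /mnt, running write_baremetal_config /mnt from a root shell would apply the change. Run sync or unmount the card before removing it.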
Once the configuration file is changed, copy the bare metal software to the SD card. The bare metal file must be named kernel7.img. This replaces the current Linux image that is on the card. Save the original if you want to return to a regular Linux boot later. A sort example taken from the DS-5 examples is attached at the bottom of this article. It has been changed slightly for the Raspberry Pi 3. If you just want to run it immediately, copy the sorts.bin file to the first partition on the SD card with the file name kernel7.img:
$ sudo cp sorts.bin /mnt/kernel7.img
Booting the Pi with this file will run the sorts bare metal application, but unfortunately you won’t see anything because there is no way to get any output from the program. Let’s see how to connect a UART so we can see some output from the bare metal application.
The Raspberry Pi 3 has a UART which can be used to print messages from a bare metal program. The UART is helpful when running without DSTREAM-ST, as semihosting cannot be used without the DS-5 debugger connection. The UART pins are on the 40-pin GPIO header of the Pi. I used the Adafruit USB to Serial cable to connect to the Raspberry Pi. The red wire is not used for the connection; only the black, white, and green wires are.
Once the cable is connected, software like minicom on Linux or PuTTY on Windows can be used to display the UART output. More details on different software options can be found on the Embedded Linux Wiki.
To use minicom on Linux:
$ sudo minicom -b 115200 -o -D /dev/ttyUSB0
Now, if the bare metal application is run, the output will be printed in the minicom terminal.
Use Ctrl-A then x to exit minicom.
So far it looks easy to run the attached bare metal sort software, but I must admit it would have been difficult to build a working application without a debugger. Even with the debugger it took several tries to understand the configuration options, boot process, and MMU settings required to produce a working application. Let’s see how to connect DSTREAM-ST and use it for debugging bare metal applications.
Connecting DSTREAM-ST to the Raspberry Pi 3 is done using jumper wires from the DSTREAM-ST JTAG 20 connector to the 40-pin GPIO header on the Raspberry Pi 3. The DSTREAM-ST connector is shown vertically in the documentation, so imagine rotating it clockwise to be horizontal, with the notch on top. This is how it sits on the desk.
Pinout also has a JTAG page which highlights the pins to use for the DSTREAM-ST connection. I used the Alt4 pinout for JTAG by connecting the pins listed in the table below:
With both the DSTREAM-ST and the UART cable connected the Raspberry Pi 3 looks like this:
The attached software has two different applications. The first one, sorts-semi.axf, uses semihosting for the printf() statements. This is the easiest type of software to get working using DS-5 and DSTREAM-ST. It doesn’t require the UART to be connected and programmed. The .axf file can simply be downloaded via DS-5 and executed.
To get started open DS-5 with a fresh workspace and create a new project using File -> New -> Makefile Project with Existing Code. Use the Browse button to navigate to the directory with the extracted software attached to this article. Select Arm Compiler 6 as the toolchain and hit the Finish button.
Use the Build and Clean functionality of DS-5 as normal to compile the software.
To connect and debug the sorts-semi.axf application, use Run -> Debug Configurations, select DS-5 Debugger, and press the New button to create a Debug Configuration. Enter a name for the configuration and search for the Raspberry Pi 3 in the Filter Platforms box. Only one CPU does anything meaningful in this software, so only the first Cortex-A53 needs to be selected.
On the Files tab select the sorts-semi.axf from the workspace, and on the Debugger tab select Debug from entry point.
Make sure the DSTREAM-ST is visible on the Connection tab and click Apply and Debug.
Debugging can proceed as normal. The semihosted printf() messages will appear in the App Console.
The second application, sorts.axf, uses the UART instead of semihosting. To run this application, simply change the sorts-semi.axf file in the Files tab to sorts.axf. A UART setup function is included to initialize the UART. During execution, the code in retarget.c sends the characters to the UART. Instead of appearing on the App Console, the messages will appear in the terminal connected to the Raspberry Pi 3.
Both of these methods are very easy because the DS-5 connection takes care of downloading the .axf files into memory and starts the debugging. There is no need to do anything with the SD card. This avoids copying the software to the SD card for each software change.
DS-5 offers an MMU viewer which helped when it came time to program the UART. The window is not visible in the default layout, but can be activated using Window -> Show View -> MMU/MPU. It will appear near the Disassembly and Memory windows. The Memory Map tab was helpful to confirm the UART address was set up as Device memory. It took some changes to the MMU programming to configure this correctly and get the UART to work. DS-5 can be used to manually change registers and then look back at the Memory Map to see if the desired settings are achieved. This saves time compared to modifying the source code and running again.
After debugging both applications using DS-5, the last step is to confirm the sorts.axf application works without the debugger and prints to the UART. This is done by replacing kernel7.img with sorts.bin on the SD card. The file must be a binary file, not an ELF file. Look at the Makefile to see how the binary file is generated with the fromelf utility from Arm Compiler 6.
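Because the boot firmware expects a raw binary, it is easy to copy the .axf file to the card by mistake. The sketch below shows one way to guard against that by checking for the ELF magic number (the byte 0x7F followed by "ELF") before installing the image as kernel7.img. The function names is_elf and install_kernel_image are hypothetical, and the fromelf command in the comment is a typical invocation given as an assumption; the attached Makefile remains the reference for how sorts.bin is actually built.

```shell
# Assumed typical conversion step (check the Makefile for the real command):
#   fromelf --bin --output=sorts.bin sorts.axf

# Succeeds if the file starts with the 4-byte ELF magic number 7f 45 4c 46.
is_elf() {
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}

# Hypothetical helper: copy a raw binary to <boot_dir>/kernel7.img.
# $1 is the image to install, $2 is the mount point of the FAT partition.
install_kernel_image() {
    image=$1
    boot_dir=$2

    if is_elf "$image"; then
        echo "error: $image is an ELF file, convert it to a raw binary first" >&2
        return 1
    fi

    # Save the existing kernel once so a regular Linux boot can be restored later.
    if [ -f "$boot_dir/kernel7.img" ] && [ ! -f "$boot_dir/kernel7.img.org" ]; then
        cp "$boot_dir/kernel7.img" "$boot_dir/kernel7.img.org" || return 1
    fi

    cp "$image" "$boot_dir/kernel7.img"
}
```

With the first partition mounted on /mnt, install_kernel_image sorts.bin /mnt (run from a root shell) would install the image, while passing sorts.axf would be rejected with an error.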
DS-5 was an immense help in reaching the goal of running bare metal software built with Arm Compiler 6 from the SD card. Starting with semihosting, then moving to a UART application downloaded into RAM, and finally putting the application on the SD card was much easier than blindly running and waiting for messages on the UART.
This introduction provides a shortcut to learning how to use Arm DS-5 for bare metal software development on the Raspberry Pi 3. Connecting to the right pins on the Raspberry Pi is critical, as is making sure JTAG is enabled in the config.txt file. A Cortex-A53 board that can run bare metal software has many uses. Next steps may be programming the Performance Monitor Unit (PMU) to read various counters, doing compiler comparisons, running a multi-core bare metal program or benchmark, or running an RTOS.
To learn more about DS-5 or to try out the examples from this article, use the link below for a free 30-day trial.
Get started with the DS-5 Development Studio
DSTREAM and DS-5 (or the new Arm Development Studio) target the use case of Linux kernel and device driver development. You can also (simultaneously) debug applications via a gdbserver connection inside the same GUI, with the same level of functionality that gdb offers. ETM trace would not be meaningful for the application space, as other tasks would be running intermittently.
The best solution would be to use the Streamline performance analyzer, another feature of DS-5/Development Studio, which can generate a real-time profile of everything executing on the target. I appreciate your comment regarding such profilers affecting the performance of the application itself, but Streamline requires no annotation of your code and uses the CPU performance counters (PMUs) to generate its statistics, so it adds very little overhead to the overall system.
I would invite you to try out the latest version of Streamline, and Development Studio overall, with a free 30-day fully featured evaluation license:
Thanks for this very useful post. I need to trace a user-space application running on Raspbian, not run a bare-metal program. Can I use the DS-5 debugger for this? I can make the same hardware connections between the target (Raspberry Pi 3B+) and the host (Ubuntu desktop/laptop) through the DSTREAM-ST probe, as you show in your diagram. I will power up the Raspberry Pi, which boots Raspbian from the SD card. Is it possible to launch an application program on the Raspberry Pi from DS-5 running on the host laptop, after Raspbian has booted? I do not want to run DS-5 on the Raspberry Pi itself to trace the user-space application, as that will affect the performance of the application. I want remote tracing with a host and a DSTREAM probe. Is that feasible?
I do not know yet. I am still a beginner at these complex tasks!
The USB00..22 is the DSTREAM-ST connected to my laptop with a USB cable. This is how the connection is made from the DS-5 debugger to the DSTREAM-ST connected to the Raspberry Pi.
Are you most interested in running bare-metal software on the Raspberry Pi 4 or having the debugger connection to the Raspberry Pi 4?
Thanks! I am going to try that too. In my case, the Debug option is not enabled, maybe because my connection is empty. What is USB000..22? Is it the UART?