Developing software involving DSP, control systems, and complex algorithms running on embedded targets is a humbling task. To streamline this often complex and tedious process, Arm has several offerings to help verify & optimize code under development & testing.
Arm Fast Models are 100% functionally accurate virtual platforms representing target hardware. Arm provides CPU models and peripherals such as DMAs, UARTs, and MMUs. Furthermore, additional hardware models can be connected to Arm Fast Models bringing benefits such as enhanced scalability, easy maintainability, and early availability compared to hardware.
Arm Compiler is the reference toolchain for Arm targets. With years of development in optimizing code size and performance, using Arm Compiler allows for more code to run faster in less space, important qualities for real-time embedded systems.
An advantage of working with Arm is the large ecosystem of tools and support. One notable partner is MathWorks, offering several products to help in the embedded development process:
These products enable MathWorks users to develop control systems and algorithms at a high-level, test the performance on their targets, and generate mass producible code across a wide range of applications.
Arm has teamed up with MathWorks to combine the best of Arm Fast Models and Arm Compiler with MATLAB, Simulink, and Embedded Coder. The result is two MathWorks Add-Ons that enable Arm Cortex-M-based virtual platforms to be used as targets in MATLAB and Simulink simulations, and Arm Compiler to directly compile and run Embedded Coder-generated source code on any Arm target. The benefits of this integration on the various stages of embedded software development are large and expansive. This blog will highlight the following values:
Let’s jump right in.
To take full advantage of the Arm and MathWorks contributions to embedded software development, testing, and optimization, the following tools are needed from MathWorks:
Simulink and Simulink Coder (the intermediate required product allowing Simulink to be used with Embedded Coder, similar to how MATLAB Coder is required for MATLAB to utilize Embedded Coder) are listed as optional as they are not required to achieve base functionality. However, most examples and developers use Simulink as it provides a useful graphical interface and user-friendly controls to develop models and run simulations. These products can be licensed through MathWorks and are downloadable from their website.
The support packages that enable Arm Fast Models and Arm Compiler to be used with MathWorks Embedded Coder are available both online and in the MATLAB Add-On Explorer, under Home > Add-Ons > Get Add-Ons.
Typing in ‘arm fast models’ will pull up both support packages; select one at a time and click ‘Add’ on each. This will automatically install and add the support package to your MATLAB path, and after the add process is done (after a few seconds) the ‘installed’ flag will be displayed over the package as shown below:
Finally, the Arm tools themselves must be downloaded with an active license or a free 30-day evaluation license. The documentation included in both the Arm Compiler Support Package and the Arm Cortex-M Fast Model Support Package has instructions on how to install and set up the proper Arm tools. A summary is provided here.
To simulate on a virtual Arm Cortex-M target, a more complete system than just the CPU is required, one that closely replicates a hardware board. Arm provides pre-configured platforms, called Fixed Virtual Platforms (FVPs), to quickly get up and running on a virtual target. For targets based on Cortex-M CPUs, the FVPs model the MPS2+ FPGA board made by Arm for rapid prototyping. This platform is more than sufficient in terms of peripheral capabilities and memory space to act as a target for embedded software under development. If more customization is required, obtain an Arm Fast Models license to build your own virtual platform. This blog explains how to install and use Arm Cortex-M FVPs. If a customized virtual platform is desired, read the documentation included in the support package for instructions on obtaining that capability.
The simplest way to obtain Arm Cortex-M FVPs, Arm Compiler, and a 30-day license for both is to download the Arm debugger, DS-5. Follow the instructions here through the ‘Obtaining a License’ step, then set the following environment variables in your system to enable MATLAB & Simulink to use these tools (replacing ‘DS-5 version’ with the correct path):
Make sure you point to the .dat file, regardless of its name, wherever it is located. It may not have been generated in the default location, so check where it is before setting this environment variable (if it isn't already set).
In MATLAB it is also necessary to point to the correct compiler path for Arm Compiler 5 and 6. These are set as preferences, which act like environment variables in MATLAB but persist after the application is closed. Set them by typing the following into the MATLAB Command Window, adjusting the paths for your system by replacing the DS-5 and Arm Compiler versions:
setpref('ArmCompiler','AC6_Path','C:\Program Files\DS-5 v5.28.0\sw\ARMCompiler6.9\bin')
setpref('ArmCompiler','AC5_Path','C:\Program Files\DS-5 v5.28.0\sw\ARMCompiler5.06u6\bin')
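A quick way to catch a mistyped path before running a simulation is to read the preferences back and check that each folder exists. The following is only a minimal sketch: it assumes the preference group and names used in the setpref calls above, and uses nothing beyond standard MATLAB functions.

```matlab
% Sketch: confirm the ArmCompiler preferences exist and point at real folders.
% The group 'ArmCompiler' and the names below are the ones set with setpref.
prefNames = {'AC6_Path', 'AC5_Path'};
for k = 1:numel(prefNames)
    name = prefNames{k};
    if ~ispref('ArmCompiler', name)
        fprintf('%s is not set\n', name);
        continue
    end
    % strtrim tolerates a stray leading or trailing space in the stored path
    compilerDir = strtrim(getpref('ArmCompiler', name));
    if exist(compilerDir, 'dir') == 7
        fprintf('%s -> %s (found)\n', name, compilerDir);
    else
        fprintf('%s -> %s (folder not found)\n', name, compilerDir);
    end
end
```

If a folder is reported as not found, rerun the matching setpref call with the corrected path.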
With the system properly configured the next step is to run some example systems.
First up is a Processor-In-the-Loop, or PIL, simulation. In a PIL simulation, the compiled object code from the desired control system/algorithm is tested by running the object code on a real target--or a virtual one. These simulations represent an important point in embedded software development by verifying that the generated object code running on an embedded target is numerically equivalent to host-based simulations. Verifying this fact indicates several things are correctly configured, including the toolchain flags--involving the compiler, linker, and assembler--and floating-point handling.
To bring up the example PIL simulation with this support package, type the following into the MATLAB Command Window:
This will open the example PIL control system, with 3 separate loops all representing the same underdamped system implemented differently.
In each loop there is a step function input into the controller, which is a reference model in the top two loops called ‘ArmCortexM_FM_pil_controller’, and a plant representing some external system’s behavior. This ‘ArmCortexM_FM_pil_controller’ reference model is the algorithm that will be simulated either on the host OS or on an embedded target through PIL testing. This is the reference model, seen by double clicking on the ‘PIL1’ block in the ‘ArmCortexM_FM_pil_top’ model:
Looking at the ‘ArmCortexM_FM_pil_top’ model, the three loops are described as follows:
As the ‘ArmCortexM_FM_pil_controller’ model is selected as our PIL-targeted model, select that model and view its Configuration Parameters. Note that changing the configuration parameters of the controller model is what affects the PIL simulation. Navigate to Configuration Parameters > Hardware Implementation > Target hardware resources > Device:
This example is pre-configured to use an Arm Cortex-M4 Fixed Virtual Platform (FVP) target located at the listed Path. Ensure this path is correct for your system before continuing. Leave all other settings the same here and navigate to the ‘Code Generation’ pane. Observe that this example is set to use Arm Compiler 6. Close this pane and return to the ‘ArmCortexM_FM_pil_top’ model.
To run the simulation, select the green play button on the top bar of the model (or use the shortcut Ctrl+T). This will take some time to convert the algorithm to C code, compile that C code to object code, load that object code onto a newly started Arm Cortex-M4 FVP process, and run the simulation. When the simulation is completed you will hear a satisfying Windows sound and the graph on the right will populate like so (with different colors):
This graph shows the following:
No red is visible on the graph because the PIL simulation is identical to the host simulation, verifying numerical equivalence. This important result requires no hardware thanks to the 100% functional accuracy of Arm Fast Models.
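The same equivalence check can also be made numerically rather than by eye. The sketch below is illustrative only: the two output vectors are stand-in data generated from an analytic underdamped step response, and the damping ratio and natural frequency are assumed values, not taken from the example model. In practice, hostOut and pilOut would be the logged outputs of the host and PIL runs.

```matlab
% Stand-in data: an analytic underdamped step response plays the role of
% both logged outputs. Real data would come from the two simulations.
t    = (0:0.1:10).';
zeta = 0.3;                    % assumed damping ratio (illustrative only)
wn   = 2;                      % assumed natural frequency, rad/s (illustrative only)
wd   = wn*sqrt(1 - zeta^2);    % damped natural frequency
hostOut = 1 - exp(-zeta*wn*t).*(cos(wd*t) + (zeta*wn/wd)*sin(wd*t));
pilOut  = hostOut;             % placeholder for the PIL-logged output

% Numerical equivalence means no difference at any sample.
maxAbsDiff  = max(abs(hostOut - pilOut));
isIdentical = isequal(hostOut, pilOut);
fprintf('max |host - PIL| = %g, identical = %d\n', maxAbsDiff, isIdentical);
```

A nonzero maximum difference would correspond to red becoming visible on the graph.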
In addition to this numerical equivalency verification, several other valuable pieces of information can be easily obtained in different views. After the simulation completed, a pop-up with the title ‘Profiling: ArmCortexM_FM_pil_top/PIL1’ appeared and the PIL1 reference model shaded blue. Clicking on the blue shaded model will populate the execution-time code profiling metrics from the PIL simulation:
The ‘ArmCortexM_FM_pil_controller’ entry shows the maximum and average time spent in that algorithm, and how many times it was called during the simulation. As this was a 10 second simulation with a 0.1 second step time, it makes sense that it was called 101 times: once for every processing step, including the initial step at time zero. Note that these maximum and average values will be consistent from run to run but are not representative of hardware timing, because Arm Fast Models are functionally accurate, not cycle accurate. The profile numbers are therefore useful for understanding relative algorithm performance and for tuning algorithms, but they should not be expected to match profiling numbers from real hardware.
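The call count follows directly from the stop time and step time. As a small worked check, using only the two values quoted above:

```matlab
stopTime = 10;     % seconds, from the example
stepTime = 0.1;    % seconds, from the example

% One call per step, plus one for the initial step at t = 0.
numCalls = round(stopTime/stepTime) + 1;
fprintf('Expected controller calls: %d\n', numCalls);
```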
Another piece of information available is the size of the algorithm in bytes downloaded onto the embedded target. View this by selecting the ‘View diagnostics’ label at the bottom of the Simulink window and scrolling up until the code size report is displayed:
To learn more about how the object code sizes displayed can be mapped to different memory locations see the Arm Compiler documentation. Using these numbers, it is possible to understand if the algorithm under development will fit on your hardware, and even to iterate on toolchain optimizations to obtain better performance and code size.
To tune algorithm parameters a different type of simulation technique is used, named external mode. In external mode simulations the target (virtual or hardware) runs the specified algorithm / system in real-time. This is opposed to PIL simulations which are constantly waiting for the next set of input data to calculate the outputs in a blocking type of implementation. External mode is therefore closer to a real-world system, with the benefit of being able to change parameters in the algorithm during the simulation and viewing the results from the real-time application change as well. This is useful for rapid prototyping and parameter tuning applications.
Bring up the example external mode simulation with this support package by typing the following into the MATLAB Command Window:
This will bring up the external mode example shown:
This algorithm is straightforward: there are two options for inputs, a sine wave and a random noise generator, both with discrete step sizes of 0.1 second. A switch selects which input is passed onwards into a variable gain block, which then outputs to a graph and an output block for additional potential analysis.
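To make the data flow concrete, here is a rough plain-MATLAB emulation of the same signal path. It is a sketch, not the Simulink model itself: the sine frequency and the use of rand for the noise source are assumptions, while the 0.1 second step, the amplitude of 2, and the gain of 1 match the values mentioned in this blog.

```matlab
stepTime = 0.1;                   % discrete step size from the example
t        = (0:stepTime:8.5).';    % time vector for an 8.5 second run
sineIn   = 2*sin(t);              % amplitude 2; the 1 rad/s frequency is assumed
rng(0);                           % fixed seed so the noise is repeatable
noiseIn  = rand(size(t));         % stand-in for the random noise generator
useSine  = true;                  % switch position
gain     = 1;                     % variable gain value

% The switch selects one input, which then passes through the gain.
if useSine
    selected = sineIn;
else
    selected = noiseIn;
end
out = gain*selected;              % what the scope would display
```

Changing useSine or gain here mirrors the parameters that external mode allows you to change while the target is running.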
Open the Configuration Parameters of this example and verify the following:
This example is also being built by the Arm Compiler 6 toolchain.
Once all preferences are set for your system, hit the green run button at the top. Note that this simulation time is set to ‘inf’ meaning infinite, only stopping when the pause button is selected. Simulink will then take some time to compile, load, and start running the application on the FVP. When it is running you will see a time increasing in the bottom left of the Simulink window as well as the FVP being opened and time elapsing on the virtual system. Double-click the Simulink ‘Scope’ block to view the output in real-time being generated. How fast the target is running depends on many factors including the speed of the host computer.
This scope shows a simulation that is switched between several settings over 8.5 seconds, starting with the switch enabling the sine wave with an amplitude of 2 to pass through the gain block of 1. The settings that are changed are as follows:
With this example it is easy to see how useful this feature can be when tuning a wide range of basic to complex algorithms, including control systems in oil refineries, fly-by-wire aircraft, trains, and cars.
Included in the Arm Cortex-M Fast Model Support Package is a regression test framework to jump-start using Arm Fast Models in regression testing. Extensive and frequent testing is critical to establish high code quality, minimize bugs, and catch errors early before they turn into large and expensive issues. Further, regression testing is an essential way to deliver the metrics and meet the high standards required to create a safety-certified product. It is an important tool in the embedded software developer’s toolbox, but unfortunately it typically comes with major hassles. Large test suites require many targets, and maintaining and scaling many hardware boards can be expensive and tedious. Furthermore, the limited number of hardware boards restricts how much testing can be parallelized, sometimes making test times unworkably long.
Virtual platforms, however, avoid these pitfalls. Scaling up targets means obtaining more licenses. Arm Fast Models need no physical maintenance, as they are software, not hardware. The ability to parallelize tests is limited only by the number of virtual platforms you want to run at the same time. To see the benefits of parallelizing virtual platforms in regression testing, use the provided regression test example along with the Parallel Computing Toolbox from MathWorks. To enable parallelization with the provided scripts, type ‘edit run_tests_script.m’ in the MATLAB Command Window and replace the line ‘results = run(runner,currTest);’ with ‘results = runInParallel(runner,currTest);’. The rest of the script stays the same.
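For orientation, a parallel run with the standard matlab.unittest API looks roughly like the sketch below. This is not the contents of run_tests_script.m: how the provided script builds its runner and test suite is not shown here, so the suite construction is an assumption. It also assumes virtualPlatformTest.m from the support package is on the MATLAB path and that the Parallel Computing Toolbox is installed.

```matlab
import matlab.unittest.TestSuite
import matlab.unittest.TestRunner

% Assumption: build the suite straight from the support package test class.
suite  = TestSuite.fromClass(?virtualPlatformTest);
runner = TestRunner.withTextOutput;

% runInParallel distributes the suite across parallel workers;
% run(runner, suite) would execute the same tests one after another.
results = runInParallel(runner, suite);

% Summarize pass/fail status and duration per test.
disp(table(results));
```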
Type ‘edit virtualPlatformTest.m’ into the MATLAB Command Window to view the unit test structure. This test structure is based on MATLAB's class-based unit testing framework (matlab.unittest). For details on how this test flow operates, see the MathWorks documentation for that framework.
The first thing to note are the parameters ‘ModelNames’, ‘TargetHardwares’, ‘Platforms’, and ‘Toolchains’. The test function will take in every combination of the listed parameters. For example, the first test will run with the following parameters:
The second test will run with the following parameters:
And so on until all the combinations are finished, a total of 18 simulations by default (6 different virtual platforms and 3 different toolchains). As in a typical testing framework, a setup function runs before each unit test to initialize variables and workspaces. In short, a unit test here simulates an example model, ‘PIL1_top’, which is identical to the PIL system earlier in this blog, first on the host OS and then on the specified virtual platform target. If the results are identical, the test passes. If the outputs differ, or any other error occurs during the testing process, such as a compilation or simulation error, the test fails. An Excel file is generated at the end of all the tests summarizing the results and the time taken by each test. Before running the test yourself, ensure the hardcoded paths to the virtual platform executables, further down the virtualPlatformTest.m file, are correct. Run the test by typing the following into the MATLAB Command Window:
This will kick off all 18 tests, with the text output for each test displayed in the Command Window. After some time, the tests will complete and a summary file, ‘myTestResults.xml’, like the one below is generated in the MATLAB current working directory:
Here there were 3 failures resulting from all three compilers attempting to run on the Arm Cortex-M1 Fast Model. That is because there is no Cortex-M1 virtual platform available from DS-5! It is an FPGA optimized core and is available in separate packages. All other tests ran successfully in varying amounts of time.
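The test and failure counts can be reproduced with a short enumeration of the parameter combinations. In this sketch the platform and toolchain labels are hypothetical placeholders, apart from Cortex-M1. Only the counts (6 platforms, 3 toolchains) and the Cortex-M1 failures come from the description above.

```matlab
% Hypothetical labels standing in for the real parameter values.
platforms  = {'Cortex-M1', 'Platform B', 'Platform C', ...
              'Platform D', 'Platform E', 'Platform F'};
toolchains = {'Toolchain 1', 'Toolchain 2', 'Toolchain 3'};

numTests    = 0;
numExpFails = 0;
for i = 1:numel(platforms)
    for j = 1:numel(toolchains)
        numTests = numTests + 1;            % one test per combination
        if strcmp(platforms{i}, 'Cortex-M1')
            numExpFails = numExpFails + 1;  % no Cortex-M1 FVP in DS-5
        end
    end
end
fprintf('%d tests, %d expected failures\n', numTests, numExpFails);
```

Every toolchain paired with the missing platform fails, which is why the failure count equals the number of toolchains.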
To speed up the total test run time, parallelize the test executions with the Parallel Computing Toolbox as mentioned above. With this framework in place it is easy to start running parallel tests on virtual platforms, drastically reducing total test time and making the safety verification process more streamlined and efficient. Alter this fast and scalable regression test example to your specific use-case and start testing today!
By leveraging the Arm Cortex-M Fast Model Support Package for Embedded Coder and the Arm Compiler Support Package for Embedded Coder, essential values during the embedded software development and testing phases are unlocked. Benefits range from simplified regression testing to toolchain verification and optimization with the packages supporting what you need to accomplish in your algorithm / control system development. Applicable to a wide range of industries, these two support packages assist across the software development lifecycle, from specification definition to coding through testing.
For a realistic example of how to use these packages to efficiently develop software for DSP applications see the joint MathWorks-Arm webinar on the subject, as well as the content on Arm Developer. Download the support packages today to get started.
Arm Cortex-M Fast Model support package
Arm Compiler support package
Update: Arm DS is now recommended over DS-5. The install steps are almost identical, with the main differences being in licensing. Please see this community post to resolve licensing issues: https://community.arm.com/developer/tools-software/tools/f/arm-compilers-forum/46400/armclang-error-in-simulink-when-using-fvp-target/