AR (augmented reality) applications require better system performance than traditional applications. The human eye easily perceives a choppy or slow representation of the physical world because we are used to seeing it smoothly and without delay. As a result, even small errors in an AR application’s rendering or frame rate can cause a poor user experience.
Clear and consistent application performance is especially important in the automotive space. Vehicles are increasingly incorporating augmented reality into the core user experience, for example by rendering navigation information directly in the driver’s field of vision. In this scenario, a distracting lag or an unclear visualization could result in a catastrophic accident.
This blog details our AR application use case and development methodology, focusing on how to optimize performance to give the best user experience.
Software, no matter how well written, is ultimately dependent on the hardware it runs on. Currently, vehicles do not have a dedicated ECU (electronic control unit) for augmented reality, so AR applications usually run within existing ECUs in the vehicle. This limits performance, so the AR application must be isolated from other applications running on the same hardware. The video content of the AR application should be displayed at a stable frame rate to avoid user motion sickness and to preserve display clarity. The latency with which an augmented object appears should stay within certain limits, which is essential for the AR application to function properly. Even small display latencies in a moving vehicle, caused by sensor or software performance, can lead to large mismatches. In a vehicle moving at 100 km/h (about 27.8 m/s), a latency of 200 ms results in a misplacement of roughly 5.6 meters between the augmented object and its real-world position. That level of inaccuracy is unacceptable for safety-critical applications in cars.
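The misplacement figure follows directly from distance = speed × latency. As a quick sanity check, here is a minimal Python sketch of that arithmetic. The function name `misplacement_m` is ours, chosen for illustration, and the calculation ignores sensor noise and vehicle rotation.

```python
def misplacement_m(speed_kmh: float, latency_ms: float) -> float:
    """Distance in meters that the vehicle travels during the given latency."""
    speed_ms = speed_kmh * 1000.0 / 3600.0    # km/h to m/s
    return speed_ms * (latency_ms / 1000.0)   # latency converted from ms to s


if __name__ == "__main__":
    # Show how the mismatch grows with latency at a fixed motorway speed.
    for latency in (20, 50, 100, 200):
        print(f"{latency:>3} ms at 100 km/h: {misplacement_m(100, latency):.2f} m")
```

At 100 km/h and 200 ms this gives a little over five and a half meters, which is more than a typical car length.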
The current ECU being used is the Samsung Exynos V9 Auto board. This board must run various other software blocks alongside the Apostera AR application and supports a broad range of performance and I/O capabilities.
The AR application needs to run in real time to be useful and provide a seamless user experience. However, most Linux variants are not real-time. The OS running on the Samsung Exynos V9 Auto board is a version of BSP (board support package) Linux supplied by the manufacturer, and it does not support real-time routines. The AR application must therefore implement its own real-time architecture and approach so that its algorithms can properly calculate the positions of augmented objects. This real-time requirement is satisfied by compensating for the measured latency when presenting information to the user.
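The text does not say how the latency compensation is implemented. One common approach, shown here purely as an assumption, is to extrapolate the last known vehicle pose forward by the measured latency before placing the augmented objects. The sketch below uses a constant-speed, constant-yaw-rate model; `VehicleState` and `compensate` are hypothetical names, not part of Apostera's code.

```python
import math
from dataclasses import dataclass


@dataclass
class VehicleState:
    x_m: float                   # position along the x axis, meters
    y_m: float                   # position along the y axis, meters
    heading_rad: float           # 0 points along +x, counter-clockwise positive
    speed_ms: float              # forward speed, meters per second
    yaw_rate_rads: float = 0.0   # turn rate, radians per second


def compensate(state: VehicleState, latency_s: float) -> VehicleState:
    """Predict the pose latency_s seconds ahead (assumed motion model)."""
    # Use the heading at the midpoint of the interval for the displacement.
    mid_heading = state.heading_rad + 0.5 * state.yaw_rate_rads * latency_s
    distance = state.speed_ms * latency_s
    return VehicleState(
        x_m=state.x_m + distance * math.cos(mid_heading),
        y_m=state.y_m + distance * math.sin(mid_heading),
        heading_rad=state.heading_rad + state.yaw_rate_rads * latency_s,
        speed_ms=state.speed_ms,
        yaw_rate_rads=state.yaw_rate_rads,
    )


if __name__ == "__main__":
    measured = VehicleState(x_m=0.0, y_m=0.0, heading_rad=0.0, speed_ms=27.78)
    print(compensate(measured, latency_s=0.2))
```

Rendering against the predicted pose rather than the measured one is what keeps the overlay aligned even though the OS itself gives no real-time guarantees.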
With these strict requirements, it is crucial to:
A comprehensive and easy-to-use performance analyzer tool is key to delivering a quality solution.
Apostera, a software company based in Munich, Germany, develops mixed reality navigation guidance systems for car manufacturers with the aim of paving the way to better autonomous driving capabilities. Their current application can transform the windshield of a car into a mixed reality screen, where vital information is displayed for safe and effective navigation.
At Apostera, profiling the performance of AR applications usually means implementing special profiling and performance gauge libraries. The output of these libraries is recorded, together with performance metrics, via a special log system (referred to as “traces” in automotive). This trace data is then extracted from the target platform either at run time or afterwards in an offline data-retrieval setup. For complex debugging and optimization cases, kernel traces are used for deeper analysis and bottleneck identification.
Working with this trace data to determine optimization routes was tedious and inefficient. The data could not be analyzed at run time, and many events could not be recorded over a long period. Getting an overall understanding of the system’s performance to identify bottlenecks was a challenge.
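To make the idea of a performance gauge library concrete, here is a minimal sketch of a scoped timer that appends one trace record per measured block and writes the records out for offline analysis. It is an illustration only: the record format, the in-memory `TRACE` list, and the names `trace_scope` and `dump_trace` are assumptions, not Apostera's library or an automotive trace standard.

```python
import json
import time
from contextlib import contextmanager

TRACE = []  # in-memory stand-in for the target's trace/log system


@contextmanager
def trace_scope(name: str):
    """Measure the wall-clock duration of the enclosed block."""
    start = time.perf_counter_ns()
    try:
        yield
    finally:
        duration = time.perf_counter_ns() - start
        TRACE.append({"name": name, "start_ns": start, "duration_ns": duration})


def dump_trace(path: str) -> None:
    """Write one JSON record per line, ready for offline retrieval."""
    with open(path, "w", encoding="utf-8") as fh:
        for record in TRACE:
            fh.write(json.dumps(record) + "\n")


if __name__ == "__main__":
    with trace_scope("render_frame"):   # hypothetical stage name
        sum(i * i for i in range(100_000))
    dump_trace("trace.jsonl")
    print(TRACE[-1])
```

Instrumentation of this kind only reports on the blocks someone thought to wrap, which is one reason a whole-system view is hard to get from traces alone.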
Apostera decided to use the Arm Streamline Performance Analyzer (a component of Arm Development Studio) to optimize performance when porting their AR application to new hardware. ‘Streamline is a complete software system profiler. It polls the system’s hardware counters to determine where the most time is spent during code execution, that is displayed in an easy to interpret GUI. The tool quickly identifies the code ‘hot-spots’ in Arm CPUs and GPUs, for example software that is taking up the most hardware resource time during execution. This allows developers to focus on optimizing the most problematic code.’
Setting up Streamline to profile their AR application was easy. The documentation walked Apostera’s team through how to activate specific options of their Linux kernel and install the gatord daemon on the target platform.
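The general idea behind finding hot spots from samples can be shown with a toy example: if the profiler periodically records which function is executing, the functions with the most samples are where the time goes. The sketch below is not Streamline's implementation. The `hot_spots` helper and the function names in the sample data are made up for illustration.

```python
from collections import Counter


def hot_spots(samples, top=3):
    """Return (function, sample_count, percent_of_total) for the busiest functions."""
    counts = Counter(samples)
    total = sum(counts.values())
    return [(fn, n, 100.0 * n / total) for fn, n in counts.most_common(top)]


if __name__ == "__main__":
    # Each entry stands for one periodic sample of the currently running function.
    samples = (
        ["detect_lanes"] * 55
        + ["render_overlay"] * 30
        + ["fuse_sensors"] * 10
        + ["log_trace"] * 5
    )
    for fn, n, pct in hot_spots(samples):
        print(f"{fn:<16} {n:>4} samples  {pct:5.1f}%")
```

With enough samples, the percentages approximate the share of execution time per function, which is what tells a developer where optimization effort will pay off.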
There are two primary methods to profile an application with Streamline:
For their current case, where either mode would work, they decided to use the offline methodology to collect data. After setting up and recording a few AR application runs, they transferred the profile data over to Streamline and started analyzing.
Streamline offers several different ways to visualize the recorded data:
The Timeline tab shows graphs of CPU and GPU activity alongside specified PMU metrics over time.
The Functions tab displays a heatmap of all functions in the code. Here the Linux kernel was compiled without debug symbols, resulting in the unknown code messages.
The Code tab shows the time spent on each source-code line for the specified source file.
The Call Paths tab displays a heatmap of code threads and shows where most of the execution time is spent.
Apostera noted that ‘we found the call paths view particularly enlightening, showing the number of samples called in each compiling unit and subunits. We immediately identified on the timeline heat-map a code area that the application spent an inordinate amount of time in. Furthermore, linking the source code to the tool and loading the application debug information to Streamline helped us find the exact file and line number of the problematic code. This quick problem identification and specificity simplifies our performance analysis and refactoring suggestions.’
Using the Arm Streamline tool has simplified profiling and analysis, providing a convenient overview of how Apostera’s application behaves during runtime.
Streamline can be downloaded or evaluated free for 30 days as a component of Arm Development Studio.
[CTAToken URL = "https://developer.arm.com/tools-and-software/embedded/arm-development-studio/components/streamline-performance-analyzer" target="_blank" text="Try Streamline" class ="green"]