Arm Community

Best practices to resolve typical multi-context rendering issues

Fred.Li
November 24, 2022

This blog provides guidance to help you resolve a rendering issue commonly seen in applications that use the camera.

What are the issues?

A common rendering issue occurs with multi-context rendering on Mali CSF (Command Stream Frontend)-based GPUs. You often encounter it when doing camera or video related rendering.

After investigation, we find that most of these issues are caused by incorrect behavior in the application.

The following figure shows an example image with the rendering issue. The corrupted regions have sharp edges and are tile-aligned.

Common application render flow

Take the following common application logic as an example:

  1. Create two shared EGL contexts (Context_A and Context_B).
  2. Create a shared texture (Tex_1).
  3. Create FBO_A1 in Context_A.
  4. Bind Tex_1 to FBO_A1.
  5. Render in Context_A to update the data in Tex_1. For example, get the image from the camera, make a simple modification, and upload it to Tex_1.
  6. Bind Tex_1 in Context_B.
  7. Sample from Tex_1 and render in Context_B. For example, apply different beauty filters to the image and show it on screen.

The rest of this blog shows how we analyze the issue, using pseudocode and figures to explain it step by step.

Issue investigation

Simplest case without any GL sync control

The following pseudocode shows the simplest case, where there are two contexts and each focuses on its own render task. However, there is no sync control in the GL part.

Figure 1 shows the actual execution sequence on the GL server side. Because GLES works asynchronously, Thread-B GL commands may start executing in the GL server while Thread-A GL commands are still held in the command queue. Therefore, Thread-B might sample outdated data, which leads to rendering errors.
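
The race can be illustrated with a toy model in plain C. It is only a sketch of the behavior described above, not driver code: `Context`, `Server`, `enqueue`, `submit_pending` and the texture "version" counter are hypothetical names invented for the illustration. Each context buffers its commands on the client side, and the server only acts on them once they are submitted.

```c
#include <assert.h>

#define MAX_CMDS 8

typedef enum { CMD_RENDER_TO_TEX, CMD_SAMPLE_TEX } CmdKind;

/* Client side of one GL context: commands recorded but not yet submitted. */
typedef struct {
    CmdKind pending[MAX_CMDS];
    int     count;
} Context;

/* Server side (assumption): the texture content is reduced to a version number. */
typedef struct {
    int tex_version;    /* content currently stored in Tex_1 */
    int last_sampled;   /* version seen by the most recent sample */
} Server;

static void enqueue(Context *ctx, CmdKind cmd) {
    assert(ctx->count < MAX_CMDS);
    ctx->pending[ctx->count++] = cmd;
}

/* Hand every buffered command of one context to the server and execute it. */
static void submit_pending(Context *ctx, Server *srv) {
    for (int i = 0; i < ctx->count; ++i) {
        if (ctx->pending[i] == CMD_RENDER_TO_TEX) {
            srv->tex_version++;                    /* Tex_1 gets new content */
        } else {
            srv->last_sampled = srv->tex_version;  /* sample whatever is there */
        }
    }
    ctx->count = 0;
}

/* Thread-A records its render first, but Thread-B's commands reach the server first. */
static int sample_without_sync(void) {
    Server  srv = { .tex_version = 0, .last_sampled = -1 };
    Context a = { .count = 0 };
    Context b = { .count = 0 };

    enqueue(&a, CMD_RENDER_TO_TEX);  /* Thread-A: update Tex_1, still in its queue */
    enqueue(&b, CMD_SAMPLE_TEX);     /* Thread-B: sample Tex_1 */

    submit_pending(&b, &srv);        /* B is executed first */
    submit_pending(&a, &srv);        /* A's update arrives too late */
    return srv.last_sampled;         /* version 0: the content from before the update */
}
```

In this model `sample_without_sync()` returns the old version of the texture even though Thread-A issued its render call first on the CPU, which is the situation the figure describes.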

Why does glFlush not help?

Many developers add glFlush after uploading the texture data, to force the Thread-A GL commands out to the GL server before waking up Thread-B.

For traditional Mali JM (Job Manager)-based GPUs, the JM receives commands from all contexts, then dispatches and executes the jobs on the hardware. Therefore, the flush ensures that the commands execute in the order shown in figure 2. This mechanism should resolve the issue on JM GPUs.

However, the issue still occurs on Mali CSF-based GPUs, because they have multiple CSFHWIF (CSF HardWare InterFace) blocks. Each CSFHWIF block can hold one context's command stream, and all of them can run in parallel. Therefore, the sampling of Tex_1 in Thread-B and the rendering to Tex_1 in Thread-A might occur at the same time. This causes the conflict shown in figure 3.
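
The difference between the two front ends can be sketched with a deliberately simplified timing model in plain C. The `Job` struct, the tick units and all function names are assumptions made for this illustration, and real hardware scheduling is considerably more complex. The model assumes that Thread-A has already flushed its render before Thread-B submits its sample.

```c
#include <assert.h>
#include <stdbool.h>

/* One unit of GPU work: when it was submitted and how long it runs (arbitrary ticks). */
typedef struct {
    int submit;
    int duration;
} Job;

/* JM style, simplified as in the text: one queue shared by all contexts,
 * so the sample cannot start before the earlier render has ended. */
static int jm_sample_start(Job render, Job sample) {
    assert(render.submit <= sample.submit);  /* glFlush in Thread-A came first */
    int render_end = render.submit + render.duration;
    return sample.submit > render_end ? sample.submit : render_end;
}

/* CSF style, simplified: each context has its own stream,
 * so the sample starts as soon as it is submitted. */
static int csf_sample_start(Job render, Job sample) {
    (void)render;  /* the other stream does not hold this one back */
    return sample.submit;
}

/* True if the sample starts while the render is still writing Tex_1. */
static bool reads_during_write(Job render, int sample_start) {
    return sample_start < render.submit + render.duration;
}
```

For a render submitted at tick 0 that runs for 10 ticks and a sample submitted at tick 1, the JM model starts the sample at tick 10, after the render, while the CSF model starts it at tick 1, in the middle of the render.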

Solution

This section offers you two methods to resolve this issue:

  • Method 1: add glFinish command
  • Method 2: use EGL Fence

Section 3.7.3.2 of the EGL Spec Version 1.5 describes in detail the order of rendering operations between contexts. Refer to the spec for the full text. The following is a screenshot from the spec:

Now let us check how each method works with our Mali CSF GPUs.

Method 1: Add glFinish command

After changing glFlush to glFinish, the texture update and sample operations execute in sequence even on the CSF GPU, as shown in figure 4.

Method 2: Use EGL Fence

Although glFinish guarantees the execution order, the GL operations that come before the sampling of Tex_1 in Thread-B are also delayed. This can reduce performance.

A better solution is to use an EGL fence to synchronize only where needed. The following code example shows the use of the myfence object:
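
The ordering that a fence enforces can also be sketched as a toy model in plain C. This is not EGL code: `Stream`, `Gpu`, `step` and the command names are hypothetical, and the model executes one command per stream per tick. In a real application, the `SIGNAL_FENCE` step would roughly correspond to creating a fence with `eglCreateSync(display, EGL_SYNC_FENCE, NULL)` in Thread-A and flushing it, and `WAIT_FENCE` to calling `eglWaitSync` in Thread-B. That mapping is an assumption based on EGL 1.5.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { RENDER_TEX, SIGNAL_FENCE, OTHER_WORK, WAIT_FENCE, SAMPLE_TEX } Cmd;

/* One context's command stream as the GPU sees it. */
typedef struct {
    const Cmd *cmds;
    int        len;
    int        pc;   /* index of the next command to execute */
} Stream;

typedef struct {
    int  tex_version;         /* content of Tex_1 */
    bool fence_signaled;
    int  sampled_version;     /* version seen by Thread-B's sample */
    int  other_work_done_at;  /* tick at which B's independent work ran */
    int  render_done_at;      /* tick at which A's render ran */
} Gpu;

/* Execute at most one command; an unsignaled WAIT_FENCE stalls only this stream. */
static void step(Stream *s, Gpu *gpu, int tick) {
    if (s->pc >= s->len) return;
    switch (s->cmds[s->pc]) {
    case RENDER_TEX:   gpu->tex_version++; gpu->render_done_at = tick; break;
    case SIGNAL_FENCE: gpu->fence_signaled = true; break;
    case OTHER_WORK:   gpu->other_work_done_at = tick; break;
    case WAIT_FENCE:   if (!gpu->fence_signaled) return; break;
    case SAMPLE_TEX:   gpu->sampled_version = gpu->tex_version; break;
    }
    s->pc++;
}

static Gpu run_with_fence(void) {
    static const Cmd a_cmds[] = { RENDER_TEX, SIGNAL_FENCE };            /* Thread-A */
    static const Cmd b_cmds[] = { OTHER_WORK, WAIT_FENCE, SAMPLE_TEX };  /* Thread-B */
    Stream a = { a_cmds, 2, 0 };
    Stream b = { b_cmds, 3, 0 };
    Gpu gpu = { .tex_version = 0, .fence_signaled = false, .sampled_version = -1,
                .other_work_done_at = -1, .render_done_at = -1 };

    /* Both streams advance in parallel; B goes first on each tick (worst case for A). */
    for (int tick = 0; tick < 16 && (a.pc < a.len || b.pc < b.len); ++tick) {
        step(&b, &gpu, tick);
        step(&a, &gpu, tick);
    }
    assert(a.pc == a.len && b.pc == b.len);  /* both streams were fully executed */
    return gpu;
}
```

In this model, Thread-B's independent work runs on the same tick as Thread-A's render, and only the sample is held back until the fence has signaled, so it sees the updated texture.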

Using myfence in this way allows the other render operations that come before the sampling of Tex_1 in Thread-B to be pulled in earlier. As a result, both contexts can execute in parallel as much as possible. The final execution order on the CSF GPU might look like figure 5:
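
As a rough back-of-the-envelope comparison of the two methods, the idealized model below estimates the GPU time for one frame. The `Workload` struct and both functions are hypothetical, and the model assumes perfect overlap between contexts and no CPU overhead, which real devices will not achieve.

```c
#include <assert.h>

typedef struct {
    int render_a;  /* GPU time of Thread-A's render into Tex_1 */
    int other_b;   /* GPU time of Thread-B's work that does not touch Tex_1 */
    int sample_b;  /* GPU time of Thread-B's pass that samples Tex_1 */
} Workload;

static int max_int(int x, int y) { return x > y ? x : y; }

/* glFinish: Thread-B is woken only after the render has completed,
 * so all of its work queues up behind the render. */
static int frame_time_with_finish(Workload w) {
    assert(w.render_a >= 0 && w.other_b >= 0 && w.sample_b >= 0);
    return w.render_a + w.other_b + w.sample_b;
}

/* Fence: Thread-B's independent work overlaps the render,
 * and only the sampling pass waits for the fence. */
static int frame_time_with_fence(Workload w) {
    assert(w.render_a >= 0 && w.other_b >= 0 && w.sample_b >= 0);
    return max_int(w.render_a, w.other_b) + w.sample_b;
}
```

With a render of 10 ticks, 6 ticks of independent work and a 4-tick sampling pass, the glFinish variant takes 20 ticks in this model and the fence variant 14. The gap grows with the amount of work in Thread-B that does not depend on Tex_1.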

Summary

Apart from the previous example of binding a texture to a framebuffer, the following scenarios can cause issues too:

  • Use a glTexSubImage* command in Thread-1 to upload the texture data, and then sample the texture in Thread-2. The data sampled in Thread-2 might be wrong if there is no synchronization.
  • Read the texture in Thread-1, and then reuse the same texture and update its data in Thread-2. The result read in Thread-1 might be corrupted if there is no synchronization control.

Modern GPUs work asynchronously, and GLES uses a client-server model. When you enable multithreaded rendering, you might encounter various issues. So we must be cautious and strictly follow the spec when designing and implementing the code.
