This blog post explains how to resolve a rendering issue that is commonly seen in applications that use the camera.
A common rendering issue occurs on Mali CSF (Command Stream Frontend)-based GPUs when an application renders from multiple contexts. It shows up most often in camera or video-related rendering.
After investigation, we found that most of these issues are caused by incorrect application behavior.
The following figure shows an example image with the rendering issue. The corrupted regions have sharp edges and are tile-aligned.
Take the following common application logic as an example:
The rest of this post shows how we analyzed the issue. We use pseudocode and figures to explain it step by step.
The following pseudocode shows the simplest case: there are two contexts, and each focuses on its own render task. However, there is no synchronization between the contexts on the GL side.
Figure 1 shows the actual execution sequence on the GL server side. Because GLES works asynchronously, GL commands from Thread-B may start executing in the GL server while GL commands from Thread-A are still held in the command queue. Therefore, Thread-B might sample outdated data, which leads to rendering errors.
Many developers add glFlush after uploading the texture data to force the GL commands from Thread-A out to the GL server before waking up Thread-B.
On traditional Mali JM (Job Manager)-based GPUs, the JM receives commands from all contexts, then dispatches and executes the jobs on the hardware. Therefore, the flush ensures that the commands are executed in the order shown in figure 2. This mechanism should help resolve the issue on JM GPUs.
The issue, however, still occurs on Mali CSF-based GPUs, because they have multiple CSFHWIF (CSF HardWare InterFace) blocks. Each CSFHWIF block can hold one context’s command stream, and the blocks can all run in parallel. Therefore, Thread-B might sample Tex_1 while Thread-A is still rendering to it. This can cause conflicts, as shown in figure 3.
This section offers two methods to resolve this issue: replacing glFlush with glFinish, or synchronizing the two contexts with an EGL fence.
Section 3.7.3.2 of the EGL Spec Version 1.5 describes in detail the order of rendering operations between contexts. The following is a screenshot from the spec:
Now let us check how each method works with our Mali CSF GPUs.
When glFlush is changed to glFinish, the texture update and sample operations execute in sequence even on a CSF GPU, as shown in figure 4.
Although glFinish guarantees the execution order, the GL operations that precede the sampling of Tex1 in Thread-B are also delayed. This can decrease performance.
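As a rough sketch of this first method, the C code below shows Thread-A calling glFinish before it wakes Thread-B. The helpers render_to_tex1, wake_thread_b, render_other_passes, and sample_tex1 are hypothetical placeholders for the application's own code, and glFinish is replaced by a stand-in that only records call order, so the sketch builds without a GPU.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins that only record call order, so the sketch builds without a GPU.
 * In a real application, remove this section and include <GLES3/gl3.h>. */
char call_log[256];
void record(const char *name) {
    assert(strlen(call_log) + strlen(name) + 2 <= sizeof call_log);
    strcat(call_log, name);
    strcat(call_log, ";");
}
void glFinish(void)            { record("glFinish"); }

/* Hypothetical application helpers (assumptions, not part of any library). */
void render_to_tex1(void)      { record("render_to_tex1"); }
void wake_thread_b(void)       { record("wake_thread_b"); }
void render_other_passes(void) { record("render_other_passes"); }
void sample_tex1(void)         { record("sample_tex1"); }

/* Thread-A: producer context. */
void thread_a_frame(void) {
    render_to_tex1();  /* draw into the framebuffer that has Tex_1 attached */
    glFinish();        /* returns only after all earlier commands in this context complete */
    wake_thread_b();   /* Tex_1 is complete before Thread-B issues anything */
}

/* Thread-B: consumer context, runs only after being woken. */
void thread_b_frame(void) {
    render_other_passes();  /* does not need Tex_1, but still starts late */
    sample_tex1();          /* safe: Tex_1 was finished before the wake-up */
}
```

The ordering is correct, but Thread-A stalls on the CPU inside glFinish and Thread-B cannot begin even its independent passes until the wake-up arrives.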
A better solution is to use an EGL fence to synchronize only where needed. The following code example shows the use of the myfence object:
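A minimal sketch of this pattern in C, under some assumptions: it uses the EGL 1.5 entry points eglCreateSync, eglWaitSync, and eglDestroySync, and it assumes a server-side wait (eglWaitSync) in Thread-B; eglClientWaitSync would instead block the CPU thread until the fence signals. The helpers render_to_tex1, render_other_passes, and sample_tex1 are hypothetical placeholders, and the EGL/GL calls are replaced by simplified stand-ins that only record call order, so the sketch builds without a GPU.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins that only record call order, so the sketch builds without a GPU.
 * In a real application, remove this section and include <EGL/egl.h> and
 * <GLES3/gl3.h> instead. The types here are simplified. */
typedef void *EGLDisplay;
typedef void *EGLSync;
#define EGL_SYNC_FENCE 0x30F9
#define EGL_NO_SYNC ((EGLSync)0)
char call_log[256];
int fence_token;
void record(const char *name) {
    assert(strlen(call_log) + strlen(name) + 2 <= sizeof call_log);
    strcat(call_log, name);
    strcat(call_log, ";");
}
EGLSync eglCreateSync(EGLDisplay d, unsigned type, const void *attribs) {
    (void)d; (void)type; (void)attribs;
    record("eglCreateSync");
    return &fence_token;
}
unsigned eglWaitSync(EGLDisplay d, EGLSync s, int flags) {
    (void)d; (void)s; (void)flags;
    record("eglWaitSync");
    return 1;
}
unsigned eglDestroySync(EGLDisplay d, EGLSync s) {
    (void)d; (void)s;
    record("eglDestroySync");
    return 1;
}
void glFlush(void)             { record("glFlush"); }

/* Hypothetical application helpers (assumptions, not part of any library). */
void render_to_tex1(void)      { record("render_to_tex1"); }
void render_other_passes(void) { record("render_other_passes"); }
void sample_tex1(void)         { record("sample_tex1"); }

/* Thread-A: producer context. Returns the fence to hand over to Thread-B. */
EGLSync thread_a_produce(EGLDisplay dpy) {
    render_to_tex1();  /* draw into the framebuffer that has Tex_1 attached */
    /* The fence signals once every command issued so far in this context is done. */
    EGLSync myfence = eglCreateSync(dpy, EGL_SYNC_FENCE, NULL);
    assert(myfence != EGL_NO_SYNC);
    glFlush();         /* submit the fence command so the wait in Thread-B can complete */
    return myfence;
}

/* Thread-B: consumer context. */
void thread_b_consume(EGLDisplay dpy, EGLSync myfence) {
    render_other_passes();         /* independent work, free to overlap with Thread-A */
    eglWaitSync(dpy, myfence, 0);  /* later commands in this context wait for the fence */
    sample_tex1();                 /* Tex_1 is complete by the time this executes */
    eglDestroySync(dpy, myfence);
}
```

How myfence is passed between the threads and how Thread-B is woken are left to the application. Only the commands issued after eglWaitSync are held back, so the wait sits directly in front of the first use of Tex_1.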
With myfence, the other render operations that precede the sampling of Tex1 in Thread-B can be pulled in earlier. As a result, the two contexts execute in parallel as much as possible. The final execution order on the CSF GPU might look like the following figure 5:
Apart from the previous example of binding a texture to a framebuffer, the following scenarios can also cause issues:
Modern GPUs work asynchronously, and GLES uses a client-server model. When you enable multithreaded rendering, you might encounter various issues. So you must be cautious and strictly follow the spec when designing and implementing your code.