Stall in Image Store with Framebuffer Fetch

Hi

I am using a Samsung Galaxy S9 (Mali-G72, OpenGL ES 3.2, r9p0 driver).

I implemented a method to copy the current pixel to an image.

I use framebuffer fetch to read the pixel, then write it to an image with an image store (not atomic).
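
For readers following along, the approach described above might look roughly like the fragment shader below. This is only a sketch under assumptions: the original post does not say which fetch extension is used (GL_EXT_shader_framebuffer_fetch is assumed here; GL_ARM_shader_framebuffer_fetch is another possibility on Mali), and the name copyTarget, the rgba8 format, and the binding point are hypothetical.

```glsl
#version 310 es
// Assumption: the EXT flavor of framebuffer fetch is in use.
#extension GL_EXT_shader_framebuffer_fetch : require
precision mediump float;

// With this extension in ESSL 3.x, an output declared inout holds the
// current framebuffer value for this pixel on entry to the shader.
layout(location = 0) inout vec4 fragColor;

// Hypothetical destination image; name, format, and binding are assumptions.
layout(rgba8, binding = 0) writeonly uniform highp image2D copyTarget;

void main() {
    // Framebuffer fetch: read the color already stored for this pixel.
    vec4 current = fragColor;

    // Plain (non-atomic) image store of that color at the same coordinate.
    imageStore(copyTarget, ivec2(gl_FragCoord.xy), current);

    // Write the fetched value back so the framebuffer is left unchanged.
    fragColor = current;
}
```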

The image is then processed by a dual filter (down/up sample) to produce the bloom texture in the next frame.

But when I use this method in OpenGL ES, the frame rate drops 30 fps.

I can see that the CPU is waiting for something, and the GPU slows down.

Using the same method in Vulkan is not a problem.

In Streamline I can see low usage on the Mali L2 cache stall counters.

So how can I find out what the problem is? (There is no problem on other vendors' GPUs either.)

Thanks

  • imageLoad/imageStore is really designed for cases where you need read-modify-write access to images, and using it will disable many optimizations that you would get for free by writing out via the framebuffer (such as framebuffer compression). For high core count configurations such as the Galaxy S8 and S9 (20 and 18 cores respectively), it is very easy to become bottlenecked on main memory if you have all cores touching memory regularly, so the loss of framebuffer compression is likely going to be painful.

    Framebuffer fetch also needs to be used with some care because it can cause dependency stalls on the pixel pipeline (a fragment in a later layer must wait for the earlier layer to commit a result to tile memory before it can be read back again - if too many threads stall then that can be expensive).

    Without knowing exactly what you are trying to do, it's hard to give specific advice. Are you able to share a reproducer APK and/or your Streamline files?

    Regards, 
    Pete
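
As an illustration of the "write out via the framebuffer" route mentioned in the reply above, one possible shape is to emit the color to a second color attachment instead of calling imageStore. This is a hedged sketch, not the poster's shader or a confirmed fix: sceneTexture, uv, and copyColor are hypothetical names, the shading is reduced to a single texture read, and it assumes the application attaches a second color buffer and enables both with glDrawBuffers. It also only fits cases where the color to copy is known when the fragment is written, for example without blending across layers.

```glsl
#version 310 es
precision mediump float;

// Hypothetical inputs standing in for whatever the real pass shades with.
uniform sampler2D sceneTexture;
in vec2 uv;

// Attachment 0: the normal render target.
layout(location = 0) out vec4 fragColor;
// Attachment 1: hypothetical extra target that receives the copy, so the
// write goes through the regular framebuffer path rather than imageStore.
layout(location = 1) out vec4 copyColor;

void main() {
    vec4 color = texture(sceneTexture, uv);

    // Same value to both attachments; no framebuffer fetch is needed
    // because the color is already available in the shader.
    fragColor = color;
    copyColor = color;
}
```

The second attachment can then be sampled as an ordinary texture by the down/up sample passes in the next frame.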
