Vulkan Subpass gets a higher GPU load?

Hello, everyone

I'm using the Unity engine to develop an Android mobile game, and I'm focusing on the Vulkan API and multiple subpasses. In our case, we split rendering into two subpasses: the first is the opaque pass and the second is the transparent pass.

I'm using three attachments: 0 for depth, 1 for color0, and 2 for color1. In the first subpass there are no input attachments, the depth attachment is index 0, and the color attachments are 1 and 2. In the second subpass the input attachments are 1 and 2, which the transparent draws need to read, the depth attachment is index 0, and the color attachments are 1 and 2, the same as in the first subpass.
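To make the layout easier to check, the attachment and subpass setup described above can be written down as plain data. This is only an illustrative sketch: the `Subpass` class and the `read_and_written` helper are hypothetical names, not Unity or Vulkan API, and the indices are taken directly from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subpass:
    """Hypothetical plain-data model of one subpass (not a Vulkan type)."""
    name: str
    input_attachments: tuple
    color_attachments: tuple
    depth_attachment: int


# Attachment indices as described in the post.
ATTACHMENTS = {0: "depth", 1: "color0", 2: "color1"}

SUBPASSES = [
    Subpass("opaque", (), (1, 2), 0),
    Subpass("transparent", (1, 2), (1, 2), 0),
]


def read_and_written(subpass):
    """Return the attachment indices used as both input and color output."""
    written = set(subpass.color_attachments)
    return sorted(i for i in subpass.input_attachments if i in written)


for sp in SUBPASSES:
    names = [ATTACHMENTS[i] for i in read_and_written(sp)]
    print(sp.name, "reads and writes:", names)
```

Laid out this way, it is visible that the transparent subpass both reads and writes color0 and color1, while the opaque subpass only writes them.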

I also have a comparable render feature that does not use multiple subpasses. It draws the opaque objects, stores the result (color and depth), copies the color and depth to other textures, and then draws the transparent objects, reading the copied color and depth during shading.

According to Streamline in Arm Mobile Studio, the first method takes more $MaliGPUCyclesGPUActiv but fewer $MaliGPUTasksFragmentTasks than the second, traditional method. It also shows high overdraw and pixel throughput, and its performance is worse.
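One thing worth checking when comparing the two methods is how the per-pixel figures are normalised. The sketch below assumes that pixels are estimated as fragment tasks multiplied by 1024 (one task covering a 32x32 pixel region); that constant is an assumption here and should be confirmed against the G57 counter document. The function name and the sample numbers are hypothetical and purely illustrative, not measurements from this project.

```python
# Assumption: one fragment task covers 32x32 = 1024 pixels.
# Confirm this against the counter documentation for the target GPU.
PIXELS_PER_FRAGMENT_TASK = 1024


def cycles_per_pixel(gpu_active_cycles, fragment_tasks):
    """Average GPU active cycles per rendered pixel for one capture."""
    pixels = fragment_tasks * PIXELS_PER_FRAGMENT_TASK
    return gpu_active_cycles / pixels


# Illustrative numbers only (not measured data).
# The merged render pass schedules fewer fragment tasks because it
# has no separate copy passes over the framebuffer.
merged = cycles_per_pixel(gpu_active_cycles=12_000_000, fragment_tasks=2_000)
split = cycles_per_pixel(gpu_active_cycles=10_000_000, fragment_tasks=6_000)

print(f"merged subpasses: {merged:.2f} cycles per pixel")
print(f"separate passes:  {split:.2f} cycles per pixel")
```

Under this assumption, a capture with fewer fragment tasks has a smaller pixel count in the denominator, so any per-pixel figure can look higher even when the amount of shading work is similar. Total active cycles per frame is the more direct basis for comparing the two methods.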

I'm wondering why this happens, and whether this kind of multi-subpass setup is suitable for our case, because most introductions to Vulkan subpasses cover deferred rendering only.

The phone is a Kirin 820 (Mali-G57), and the method comes from the G57 counter documentation.

Thanks all

  • Hi, 

    Depending on the algorithm, merged subpasses can be slower due to scheduling bubbles between layers (a later layer cannot progress until an earlier layer at the pixel location has written its result to tile memory). This can result in a clock-for-clock performance reduction, but still usually results in a system-wide energy efficiency improvement due to the lower memory bandwidth.

    These issues should be much improved in newer Mali GPUs such as the Mali-G710. 

    Kind regards, 
    Pete
