I created a very simple scene in UE4 (an empty level, so only a floor with the default material and a sky), and I noticed that the Vulkan version reports many more input primitives than the OpenGL version (the data below covers a 10-second range).
Then I used RenderDoc to capture the scene on both Vulkan and OpenGL, and it turns out the primitives sent to the GPU are the same:
Streamline describes input primitives as "The total number of input primitives to the rendering process", so I assume the counter should track the number of vertices submitted to the GPU. If we submit the same number of vertices (with the same primitive type, of course), I would expect the same number of input primitives.
Could a Mali expert take a look at this issue? I can provide the APK, Streamline files, and RenderDoc capture files if needed.
Thanks!
Yes. It's 100% reproducible.
Hi iculi,
One thing you can try here, instead of comparing e.g. a 10-second range, is to zoom in to a single frame, select it, and compare that directly. This is usually much easier because then you don't need to worry about what the frame rate is.
For this frame, based on RenderDoc, I see we have 2 main draws of 288 and 11904 indices. That's (11904 + 288) / 3 = 4064 primitives, and on my G77 device I see 4080 primitives total per frame (measured using the zoom-in-to-frame approach). That makes sense as there are of course a few post-processing full-screen-quad draws, as well as some UI, here, in addition to those main-pass draws.
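As a quick sanity check, the arithmetic above can be sketched in Python. This is only an illustration: it assumes every draw uses a triangle-list topology (3 indices per primitive), and the function name is made up for the example rather than taken from any tool.

```python
def triangles_from_indices(index_counts):
    """Estimate the primitive count for a set of draws.

    Assumes a triangle-list topology, i.e. 3 indices per primitive.
    """
    return sum(count // 3 for count in index_counts)


# Index counts of the two main draws seen in the RenderDoc capture.
main_draws = [288, 11904]
main_pass_primitives = triangles_from_indices(main_draws)
print(main_pass_primitives)  # 4064

# Per-frame total measured in Streamline on the G77 device.
measured_per_frame = 4080

# The remainder is what post-processing and UI draws would account for.
print(measured_per_frame - main_pass_primitives)  # 16
```

The small remainder is consistent with a handful of full-screen-quad and UI draws on top of the main pass.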
Looking at your Redmi results you have 4.9M primitives. Is that still for a 10-second range? If so, it could make sense, given that 4080 primitives * 120 fps * 10 seconds = 4.896M primitives, which matches pretty much exactly.
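The same estimate can be written as a small Python sketch that converts between a per-frame count and a total over a capture range. The helper names are hypothetical, and the frame rate is an input you have to read from your own capture; 120 fps here is only the value that makes the numbers above line up.

```python
def expected_primitives(per_frame, fps, seconds):
    """Total primitives expected over a range at a steady frame rate."""
    return per_frame * fps * seconds


def per_frame_from_total(total, fps, seconds):
    """Normalize a range total back to primitives per frame."""
    return total / (fps * seconds)


# 4080 primitives per frame at an assumed 120 fps over 10 seconds.
print(expected_primitives(4080, 120, 10))  # 4896000

# Going the other way: about 4.9M primitives over the same range.
print(round(per_frame_from_total(4_900_000, 120, 10)))  # 4083
```

Normalizing both captures to primitives per frame like this removes the frame rate from the comparison, which is the same reason zooming in to a single frame is easier.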
On my G77 device I see the same number of primitives per frame in both Vulkan and GLES, as one would expect. The only difference I spot is that there are some more tiles/tasks rendered in Vulkan, and consequently it runs a bit slower. In my experience this is usually because UI rendering in UE involves 2 render passes in Vulkan, due to a limitation in Unreal Engine / Slate. So overall this seems expected.
If you see a significant difference in input primitives per frame on your side on G78, that is, in general, unexpected, I'd say. It's possible, however, that the device vendor has implemented some special optimization that causes this.
Cheers, Christian
Hi Christian. The Redmi results are still for a 10-second range. I chose 10 seconds because I think the sum over a range is more stable than a single frame. I tried zooming in to one frame, and I can see that on Vulkan there are many spikes, which cause the high input primitive count.
Both Streamline captures were taken on a G78 device at the same 60 fps.