Hi,
I was pleased to find that, after updating to the Android 14 preview, both VK_EXT_extended_dynamic_state and VK 1.3 are supported. However, when I updated my application to take advantage of the new vertex binding call, 'bindVertexBuffers2EXT', making use of both the dynamic stride and limit capabilities, I found that it causes severe graphical corruption. It is almost as if the GPU skips some draws entirely and lets others through just fine. I can send an RDC capture or a demo app if necessary.
HW Info:
Mali-G78
Pixel 6 Pro
Driver 38.1.0
Hey ByLaws,
Thanks for another report, it's much appreciated :) This is unfortunately indeed broken on r36p0 up-to-and-including r39 (except for r38p2) -- there are multiple scenarios where we fail to handle the pStrides argument correctly, so in practice we can only recommend to never pass anything except nullptr here...
This is fixed on r40p0 and up -- however there is a separate issue affecting vkCmdBindVertexBuffers2EXT on r40 and r41 related to calling the function multiple times. I'm clarifying the specific issue there now (i.e. if redundantly binding the same PSO is a workaround for this or not); will report back on this once I know.
Cheers,
Christian
Hi again!
On the r40/r41 point -- seems like on these drivers vkCmdBindVertexBuffers2EXT will incorrectly apply the dynamic stride to all previous draws using the same PSO. In other words redundant re-bind is not a workaround, unfortunately.
In practice, it's best to avoid dynamic stride (pStrides != nullptr) until r42 and up.
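The advice above can be turned into a small driver-version gate. The following is only a minimal sketch, not Arm's or the reporter's code: `MaliRelease`, `decodeDriverVersion`, `isDynamicStrideSafe` and `selectStrides` are hypothetical names, and it assumes the Arm driver packs its rXpY release into the major and minor fields of a Vulkan-style packed version (which matches the "Driver 38.1.0" reported at the top of the thread, but is not confirmed here). It conservatively treats everything below r42 as unsafe, including r38p2, which the thread mentions as an exception.

```cpp
#include <cstdint>

// Hypothetical holder for an Arm driver release "rXpY" (e.g. r38p1).
struct MaliRelease {
    std::uint32_t r;  // major release number (the X in rXpY)
    std::uint32_t p;  // patch release number (the Y in rXpY)
};

// Decode a Vulkan-style packed version: major in bits 22 and up,
// minor in bits 12..21.
// ASSUMPTION: the driver stores rXpY as major.minor in its reported
// driver version; this is inferred from "Driver 38.1.0", not documented here.
inline MaliRelease decodeDriverVersion(std::uint32_t packed) {
    return MaliRelease{packed >> 22, (packed >> 12) & 0x3FFu};
}

// Follows the thread's advice: avoid dynamic stride until r42 and up.
// r38p2 was mentioned as unaffected and could be allow-listed separately.
inline bool isDynamicStrideSafe(MaliRelease v) {
    return v.r >= 42;
}

// Returns the pointer to pass as pStrides: the caller's array on safe
// drivers, nullptr otherwise. Strides are 64-bit, like VkDeviceSize.
inline const std::uint64_t* selectStrides(MaliRelease v,
                                          const std::uint64_t* strides) {
    return isDynamicStrideSafe(v) ? strides : nullptr;
}
```

When `selectStrides` returns nullptr, the pipeline should also be created without the dynamic vertex input binding stride state, so that the stride comes from the pipeline's static vertex input description.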
Thank you!
I'll follow your advice of avoiding dynamic stride for <r42 then, appreciate the quick and detailed response :) Are there any other known dynamic state issues or should I be fine with just the stride workaround?
Sounds good :) There were various issues with other dynamic state as well, but I believe we caught them in time. That said, there's always a small non-zero chance that not all patches made it to all partners before ship -- so it might be a good idea to be on the lookout for issues and to test without dynamic state if you see any weird behaviours. As far as I know, vkCmdBindVertexBuffers2EXT is the only one that is definitely problematic.
Hi there,
I have found that r42 drivers disable dynamic state entirely. Is this expected to be the case for future revisions as well?
On a further note, we found a bug with dynamic state 2 rasterizer discard: if it is enabled in the static pipeline state but disabled through the dynamic state, draws still behave as if it were enabled and do not follow the dynamic state.
Hi ByLaws,
This sounds unexpected... I'll check with the driver team. We definitely do not intend for this to be disabled; it's possible the partner here has decided to disable this for now pending bug-fixes for the known issues. Are you seeing this on a Pixel device, or something else?
I'll also check the point about rasterizer discard. Thanks for letting us know!
Yup, we are seeing this on a Pixel 7 Pro running the latest beta 3 firmware. Thanks for looking into the issue :)
Thank you! Checking internally, it seems Google has indeed disabled the extensions on their side here. Our understanding is it will be re-enabled in upcoming releases, however, so this should hopefully only be a temporary issue affecting the Pixel beta-releases specifically.
For rasterizer discard, so far it sounds like this might be a new issue -- we'll investigate. Thanks again for the report :)
Hi again ByLaws,
We weren't able to reproduce this in a directed test so far; I wonder if there might be some more complicated interaction of things which cause the issue.
Is it possible to share a reproducer (APK / RenderDoc capture / vktrace/GFXReconstruct trace), or an API dump, of a case where you see the wrong state is used? I'm hoping this might help us spot where things are going wrong here.
For reference, as there is a chance the Pixel driver is modified in some way compared to a stock driver, one of the things we tested was the very basic case of a PSO with RD = true, with RD as the only dynamic state, and with that dynamic state set to false. We also tried some combinations of using a different PSO with a different static state earlier, as we've seen bugs related to this in the past, but it didn't seem to apply in this case. Have you seen even a basic case like this fail on this driver? (To be clear, we've not tested on the Pixel specifically.)
In short, we tried a few 'possibly problematic' cases but struggled to get a repro so far, so hoping you might have some additional info which may give some hints. For sharing repros or similar please feel free to send us an email at developer at arm.com and we can take it from there. :)
My apologies here. I hope I didn't cause you to waste too much time: after looking further into it, this seems to have been a bit of an edge case on our side.
Our backend was not properly handling the case where dynamic state 1 is disabled (to work around the stride bug) but dynamic state 2 is enabled. This caused the wrong number of dynamic states to be passed in at pipeline creation time, leading to the issue. I'll try to be a bit more cautious reporting driver bugs next time :)
Thanks,
Billy
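For readers hitting the same application-side mistake, one way to avoid a mismatched dynamic state count is to build the list in a container and derive the count from it. This is only a rough sketch under stated assumptions and not the reporter's actual backend: `DynState`, `DynamicStateConfig` and `buildDynamicStates` are hypothetical stand-ins (a real backend would push VkDynamicState values instead), and the particular states chosen for each extension are illustrative.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for VkDynamicState values, so the sketch
// compiles without the Vulkan headers.
enum class DynState : std::uint32_t {
    Viewport,
    Scissor,
    VertexInputBindingStride,  // illustrative dynamic state 1 entry
    CullMode,                  // illustrative dynamic state 1 entry
    RasterizerDiscardEnable    // illustrative dynamic state 2 entry
};

// Hypothetical per-device switches: each extension can be turned off
// independently, e.g. dynamic state 1 disabled as a driver workaround.
struct DynamicStateConfig {
    bool useDynamicState1;
    bool useDynamicState2;
};

// Build the list so that its size is always the number of states actually
// requested, whichever combination of extensions is enabled.
inline std::vector<DynState> buildDynamicStates(const DynamicStateConfig& cfg) {
    std::vector<DynState> states{DynState::Viewport, DynState::Scissor};
    if (cfg.useDynamicState1) {
        states.push_back(DynState::VertexInputBindingStride);
        states.push_back(DynState::CullMode);
    }
    if (cfg.useDynamicState2) {
        states.push_back(DynState::RasterizerDiscardEnable);
    }
    return states;
}
```

In a real backend the resulting vector would feed VkPipelineDynamicStateCreateInfo, with dynamicStateCount taken from `states.size()` and pDynamicStates from `states.data()`, so the count cannot drift from the contents when one extension is disabled.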
No worries at all! We really appreciate the reports; please do not hesitate to keep reaching out. :) We know how painful it can be to debug some of these things at API level -- with little-to-no insight into what happens inside the driver -- so we're happy to help out wherever we can.