When testing my applications on Android 15 with a Mali-G710 MP7 (Pixel 7), I noticed that any time I submit a secondary command buffer to a render pass I get the following validation error when presenting the swap chain:
VUID-VkPresentInfoKHR-pImageIndices-01430: Validation Error: [ VUID-VkPresentInfoKHR-pImageIndices-01430 ] Object 0: handle = 0xb400007be5014a90, type = VK_OBJECT_TYPE_QUEUE; | MessageID = 0x48ad24c6 | vkQueuePresentKHR(): pPresentInfo->pSwapchains[0] images passed to present must be in layout VK_IMAGE_LAYOUT_PRESENT_SRC_KHR or VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR but is in VK_IMAGE_LAYOUT_UNDEFINED. The Vulkan spec states: Each element of pImageIndices must be the index of a presentable image acquired from the swapchain specified by the corresponding element of the pSwapchains array, and the presented image subresource must be in the VK_IMAGE_LAYOUT_PRESENT_SRC_KHR or VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR layout at the time the operation is executed on a VkDevice (docs.vulkan.org/.../wsi.html)
After a few frames, the device is lost. I haven't encountered either issue on any other platform when running the same tester going through the same Vulkan code paths.
Here is a screenshot from RenderDoc when running the same tester on Linux with the same configuration to draw with a secondary command buffer:
I have highlighted the barriers that perform the layout transitions. Here are the contents of the barriers themselves:
Some other things to note:
When debugging, I even tried waiting for the GPU to be idle before processing the submit, and I still had the same behavior. The following can avoid both the validation and device lost issues:
So far I haven't been able to find the source of the error in the image transitions, command buffer setup, queue submission, or swapchain presentation. I see that Mali-G710 was advertised as the first Mali GPU with native secondary command buffer support. Are there extra steps that I'm missing when the hardware support is used on Mali, such as extra barriers? Could this be a driver issue, perhaps Android-specific with the swapchains?
The project is open source; I can share it, along with instructions to reproduce, if need be.
I remembered that RenderDoc fails if validations are enabled and my automatic detection doesn't work on Android. After manually disabling validations (by making sure the enableValidation() function on line 329 in modules/Render/RenderVulkan/src/VkInit.c returns false), I was able to get a frame trace in RenderDoc.
The RenderDoc frame trace of the crashing frame on Android looks identical to the one I posted above from desktop Linux. The requisite memory barriers for performing the layout transitions on the swapchain image are present, and the same image instance is used by both the barriers and the call to present at the end. The RenderDoc process running on the device also ends up crashing due to device lost when trying to replay the trace.
I realize that validation runs at a level above the driver, but do the validation layers query any state from the underlying Vulkan implementation? The frame trace seems to show that I am already fulfilling the requirements named in the validation error, and the Mali device is the only one showing the validation errors and device lost, which makes me think that something beneath the surface is causing the failure.