Hi Forum,
This is in follow up with my previous query about timing drawcall: https://community.arm.com/support-forums/f/graphics-gaming-and-vr-forum/54832/vulkan-app-drawcall-timing/
I experimented with SSAO application, https://github.com/SaschaWillems/Vulkan/tree/master/examples/ssao and trying to understand the timing at renderpass level.
Attached is the timing which shows 4 renderpasses in which the ssao shader is invoked in renderpass 2. The timing of this renderpass 2 looks reasonable.
What I could not understand is why the timing of other render passes is so high. And this seems to be the case for all renderpass with more that 1 drawcall, the timing seems way off.
Could this be a bug in renderdoc or am I missing anything here?
Thank you,
EID | Event | Action # | Duration (Microseconds) ----------------------------------------------------------------------------------------------------------- 1 | - vkDeviceWaitIdle(Device 9) | | 2 | - vkDeviceWaitIdle(Device 9) | | 3 | - vkUnmapMemory(Memory 983, (200 bytes)) | | 4 | - vkUnmapMemory(Memory 986, (76 bytes)) | | 5 | - vkQueueWaitIdle(Queue 124) | | 6 | - vkFlushMappedMemoryRanges(Memory 2530, (11200 bytes)) | | 7 | - vkFlushMappedMemoryRanges(Memory 2533, (1860 bytes)) | | 8 | - Coherent Mapped Memory Write(Memory 2530, (11200 bytes)) | | 9 | - Coherent Mapped Memory Write(Memory 2533, (1860 bytes)) | | 10 | - vkQueueSubmit({ { Command Buffer 4845 } }) | | 11 | - => vkQueueSubmit(1)[0]: vkBeginCommandBuffer(Baked Command Buffer 4863) | 1 | 104 | - Colour Pass #1 (3 Targets + Depth) | 1-26 | 97961.00448 12 | \- vkCmdBeginRenderPass(C=Clear, D=Clear, S=Don't Care) | 1 | 13 | - vkCmdSetViewport(0, { { 0.00, 0.00, 2696.00, 1260.00 } }) | | 14 | - vkCmdSetScissor(0, { { { 0, 0 }, { 2696, 1260 } } }) | | 15 | - vkCmdBindPipeline(Graphics Pipeline 1025) | | 16 | - vkCmdBindDescriptorSets(0, { Descriptor Set 1006 }) | | 17 | - vkCmdBindVertexBuffers(0, { Buffer 900 }) | | 18 | - vkCmdBindIndexBuffer(Buffer 903, VK_INDEX_TYPE_UINT32) | | 19 | - vkCmdBindDescriptorSets(1, { Descriptor Set 926 }) | | 20 | - vkCmdDrawIndexed(2388, 1) | 1 | 4347.61558 21 | - vkCmdBindDescriptorSets(1, { Descriptor Set 927 }) | | 22 | - vkCmdDrawIndexed(30504, 1) | 2 | 4266.84635 23 | - vkCmdBindDescriptorSets(1, { Descriptor Set 928 }) | | 24 | - vkCmdDrawIndexed(17628, 1) | 3 | 4264.38481 25 | - vkCmdBindDescriptorSets(1, { Descriptor Set 929 }) | | 26 | - vkCmdDrawIndexed(8448, 1) | 4 | 4219.38481 27 | - vkCmdBindDescriptorSets(1, { Descriptor Set 930 }) | | 28 | - vkCmdDrawIndexed(63, 1) | 5 | 4218.15404 29 | - vkCmdBindDescriptorSets(1, { Descriptor Set 931 }) | | 30 | - vkCmdDrawIndexed(21264, 1) | 6 | 4161.84634 31 | - vkCmdBindDescriptorSets(1, { Descriptor Set 932 }) | | 32 | - vkCmdDrawIndexed(2640, 1) | 7 | 4161.46173 33 | - vkCmdBindDescriptorSets(1, { Descriptor Set 933 }) | | 34 | - vkCmdDrawIndexed(69624, 1) | 8 | 4063.61557 35 | - vkCmdBindDescriptorSets(1, { Descriptor Set 934 }) | | 36 | - vkCmdDrawIndexed(54, 1) | 9 | 4062.76942 37 | - vkCmdBindDescriptorSets(1, { Descriptor Set 935 }) | | 38 | - vkCmdDrawIndexed(53064, 1) | 10 | 4018.53865 39 | - vkCmdBindDescriptorSets(1, { Descriptor Set 936 }) | | 40 | - vkCmdDrawIndexed(10416, 1) | 11 | 4017.53865 41 | - vkCmdBindDescriptorSets(1, { Descriptor Set 937 }) | | 42 | - vkCmdDrawIndexed(12258, 1) | 12 | 3991.07711 43 | - vkCmdBindDescriptorSets(1, { Descriptor Set 938 }) | | 44 | - vkCmdDrawIndexed(9126, 1) | 13 | 3990.3848 45 | - vkCmdBindDescriptorSets(1, { Descriptor Set 939 }) | | 46 | - vkCmdDrawIndexed(94308, 1) | 14 | 3870.69248 47 | - vkCmdBindDescriptorSets(1, { Descriptor Set 940 }) | | 48 | - vkCmdDrawIndexed(49536, 1) | 15 | 3868.30787 49 | - vkCmdBindDescriptorSets(1, { Descriptor Set 941 }) | | 50 | - vkCmdDrawIndexed(43008, 1) | 16 | 3777.23094 51 | - vkCmdBindDescriptorSets(1, { Descriptor Set 942 }) | | 52 | - vkCmdDrawIndexed(49536, 1) | 17 | 3776.00017 53 | - vkCmdBindDescriptorSets(1, { Descriptor Set 943 }) | | 54 | - vkCmdDrawIndexed(43008, 1) | 18 | 3698.15402 55 | - vkCmdBindDescriptorSets(1, { Descriptor Set 944 }) | | 56 | - vkCmdDrawIndexed(33024, 1) | 19 | 3697.30786 57 | - vkCmdBindDescriptorSets(1, { Descriptor Set 945 }) | | 58 | - vkCmdDrawIndexed(56832, 1) | 20 | 3629.4617 59 | - vkCmdBindDescriptorSets(1, { Descriptor Set 946 }) | | 60 | - vkCmdDrawIndexed(59484, 1) | 21 | 3628.61555 61 | - vkCmdBindDescriptorSets(1, { Descriptor Set 947 }) | | 62 | - vkCmdDrawIndexed(96, 1) | 22 | 3586.69247 63 | - vkCmdBindDescriptorSets(1, { Descriptor Set 948 }) | | 64 | - vkCmdDrawIndexed(27552, 1) | 23 | 3586.07709 65 | - vkCmdBindDescriptorSets(1, { Descriptor Set 949 }) | | 66 | - vkCmdDrawIndexed(49488, 1) | 24 | 3529.92324 67 | - vkCmdBindDescriptorSets(1, { Descriptor Set 950 }) | | 68 | - vkCmdDrawIndexed(43452, 1) | 25 | 3528.92324 69 | - vkCmdEndRenderPass(C=Store, D=Store, S=Don't Care) | 26 | 105 | - Colour Pass #2 (1 Targets) | 26-27 | 34781.92467 70 | \- vkCmdBeginRenderPass(Clear) | 26 | 71 | - vkCmdSetViewport(0, { { 0.00, 0.00, 1348.00, 630.00 } }) | | 72 | - vkCmdSetScissor(0, { { { 0, 0 }, { 1348, 630 } } }) | | 73 | - vkCmdBindDescriptorSets(0, { Descriptor Set 1009 }) | | 74 | - vkCmdBindPipeline(Graphics Pipeline 1020) | | 75 | - vkCmdDraw(3, 1) | 26 | 34781.92467 76 | - vkCmdEndRenderPass(Store) | 27 | 106 | - Colour Pass #3 (1 Targets) | 27-28 | 1208.92313 77 | \- vkCmdBeginRenderPass(Clear) | 27 | 78 | - vkCmdSetViewport(0, { { 0.00, 0.00, 2696.00, 1260.00 } }) | | 79 | - vkCmdSetScissor(0, { { { 0, 0 }, { 2696, 1260 } } }) | | 80 | - vkCmdBindDescriptorSets(0, { Descriptor Set 1012 }) | | 81 | - vkCmdBindPipeline(Graphics Pipeline 1022) | | 82 | - vkCmdDraw(3, 1) | 27 | 1208.92313 83 | - vkCmdEndRenderPass(Store) | 28 | 107 | - Colour Pass #4 (1 Targets + Depth) | 28-31 | 8441.61577 84 | \- vkCmdBeginRenderPass(C=Clear, DS=Clear) | 28 | 85 | - vkCmdSetViewport(0, { { 0.00, 0.00, 2696.00, 1260.00 } }) | | 86 | - vkCmdSetScissor(0, { { { 0, 0 }, { 2696, 1260 } } }) | | 87 | - vkCmdBindDescriptorSets(0, { Descriptor Set 1015 }) | | 88 | - vkCmdBindPipeline(Graphics Pipeline 1018) | | 89 | - vkCmdDraw(3, 1) | 28 | 2864.61552 90 | - vkCmdSetViewport(0, { { 0.00, 0.00, 2696.00, 1260.00 } }) | | 91 | - vkCmdSetScissor(0, { { { 0, 0 }, { 2696, 1260 } } }) | | 92 | - vkCmdBindPipeline(Graphics Pipeline 254) | | 93 | - vkCmdBindDescriptorSets(0, { Descriptor Set 252 }) | | 94 | - vkCmdPushConstants(VK_SHADER_STAGE_VERTEX_BIT, (16 bytes)) | | 95 | - vkCmdBindVertexBuffers(0, { Buffer 2529 }) | | 96 | - vkCmdBindIndexBuffer(Buffer 2532, VK_INDEX_TYPE_UINT16) | | 97 | - vkCmdSetScissor(0, { { { 0, 0 }, { 2696, 1260 } } }) | | 98 | - vkCmdDrawIndexed(159, 1) | 29 | 2800.30782 99 | - vkCmdSetScissor(0, { { { 24, 43 }, { 237, 283 } } }) | | 100 | - vkCmdDrawIndexed(771, 1) | 30 | 2776.69243 101 | - vkCmdEndRenderPass(C=Store, D=Store, S=Don't Care) | 31 | 102 | - => vkQueueSubmit(1)[0]: vkEndCommandBuffer(Baked Command Buffer 4863) | 31 | 103 | - vkQueuePresentKHR(Swapchain Image 4778) | 31 |
Not sure what RenderDoc is doing under the hood here, but in general a lot of timing techniques only really make sense for an immediate-mode GPU architecture. The fact that RenderDoc seems to be trying to time individual draw commands makes me think that's it's relatively invasive in terms of what it's doing.