
Is it possible to short-circuit a submitted draw based on time?

This is an interesting question: Is it possible to query a timestamp and, in doing so, determine whether vertex/fragment shaders should return prematurely, thus rushing to the end of the draw call once a particular time has elapsed? Put another way, can we use a timestamp to somehow short-circuit a submitted draw call? For example, could we issue millions of instances in a call and 'end' the call (read a timestamp in the shader) when we got to, say, 14ms, thus keeping each frame inside the roughly 16.7ms budget of 60fps?

Would the workload have to be split into batches and the "time query" have to be done prior to issuing each call?
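
The batching idea in this question could be sketched roughly as below. This is only an illustration, not a confirmed technique: `submit_within_budget`, the two callback types, and the simulated clock in `demo_submit` are hypothetical names, and in a real app `draw_batch` would wrap something like an instanced draw call. Note that a CPU clock only bounds how long submission takes; because draws are queued asynchronously, it says nothing certain about when the GPU finishes them.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callbacks: now_ms returns a monotonic time in milliseconds,
   draw_batch issues one batch of instances (e.g. one instanced draw call). */
typedef uint64_t (*now_ms_fn)(void *ctx);
typedef void (*draw_batch_fn)(void *ctx, size_t first_instance, size_t count);

/* Submits batches until every instance is issued or the budget is spent.
   The clock is checked BEFORE each batch, so a batch that has started is
   never cancelled. Returns the number of instances actually submitted. */
size_t submit_within_budget(size_t total_instances, size_t batch_size,
                            uint64_t budget_ms, now_ms_fn now_ms,
                            draw_batch_fn draw_batch, void *ctx)
{
    if (batch_size == 0) return 0;
    uint64_t start = now_ms(ctx);
    size_t submitted = 0;
    while (submitted < total_instances) {
        if (now_ms(ctx) - start >= budget_ms) break;   /* budget spent */
        size_t remaining = total_instances - submitted;
        size_t count = remaining < batch_size ? remaining : batch_size;
        draw_batch(ctx, submitted, count);
        submitted += count;
    }
    return submitted;
}

/* Simulated clock for demonstration: every batch "costs" a fixed time. */
struct sim { uint64_t t; uint64_t cost_ms; };

static uint64_t sim_now(void *ctx) { return ((struct sim *)ctx)->t; }

static void sim_draw(void *ctx, size_t first_instance, size_t count)
{
    struct sim *s = ctx;
    (void)first_instance;
    (void)count;
    s->t += s->cost_ms;   /* pretend the batch took cost_ms to submit */
}

size_t demo_submit(size_t total, size_t batch, uint64_t budget_ms,
                   uint64_t ms_per_batch)
{
    struct sim s = { 0, ms_per_batch };
    return submit_within_budget(total, batch, budget_ms, sim_now, sim_draw, &s);
}
```

With the simulated clock, `demo_submit(1000, 100, 14, 4)` issues four batches (at 0, 4, 8 and 12ms) and stops before the fifth, so 400 of the 1000 instances are submitted.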

Sean

  • If there is an absence of time-queries, one possible solution may be to use a buffer object to store either a time, or a code to indicate that a time threshold has elapsed. The CPU will thus be required to either update this periodically, or to update it at a particular time threshold. Each shader program designed to short-circuit (i.e. to cancel work) after the threshold will read the time/threshold flag from the buffer object and test it. If the time has elapsed, the shader can exit early without doing its workload.

    Thoughts and ideas would be most welcome!
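
    The buffer-object idea above could be sketched as follows, assuming GLES 3.0 and a uniform block. The block name `Deadline`, the attribute `a_position`, and the helper names are hypothetical. One assumption worth flagging: GL executes commands in order, so a buffer update issued after a draw should not be expected to change that draw; the flag would gate draws submitted after the update rather than one already in flight.

    ```c
    #include <stdint.h>

    /* GLSL ES 3.00 vertex shader. When the flag is set, every vertex is moved
       outside the clip volume (x > w), so its primitive is clipped away and
       no fragment work is generated for it. */
    static const char *const kEarlyExitVertexShader =
        "#version 300 es\n"
        "layout(std140) uniform Deadline { uint expired; };\n"
        "in vec4 a_position;\n"
        "void main() {\n"
        "    if (expired != 0u) {\n"
        "        gl_Position = vec4(2.0, 2.0, 2.0, 1.0);\n"
        "        return;\n"
        "    }\n"
        "    gl_Position = a_position;\n"
        "}\n";

    /* Returns the shader source, e.g. to hand to glShaderSource. */
    const char *early_exit_vertex_shader(void)
    {
        return kEarlyExitVertexShader;
    }

    /* CPU side: the value to write into the buffer object backing the
       Deadline block (for example with glBufferSubData) before the next draw. */
    uint32_t deadline_flag(uint64_t now_ms, uint64_t frame_start_ms,
                           uint64_t budget_ms)
    {
        return (now_ms - frame_start_ms >= budget_ms) ? 1u : 0u;
    }
    ```

    The per-vertex test still runs for every vertex, so this only skips the remaining shader work; it does not remove the cost of launching the vertices.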

  • Sean, it's not exactly what you've asked for, but you may be interested in the "EGL_IMG_context_priority" extension to EGL. With that extension you can set the priority of a GLES context to HIGH, MEDIUM, or LOW.

    I do understand you've asked for the ability to *cancel* a draw call if too much time has passed, but is the underlying question how to interrupt an existing draw call to get some other GL work done? If so, you could make your existing (potentially long) draws with a MEDIUM or LOW priority context, and then issue your other GL work in a HIGH priority context, and the driver may interrupt the LOW priority context to complete the HIGH priority calls.

    (One use for this extension is to have a long-running render, say 100ms, in a low-priority thread/context, and a short render, say 8ms, in another high-priority thread that uses the output of the long-running render as a source texture. That way you can always update the screen with motion or HUD contents at a high frame rate, even if a scene render takes a long time.)
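
    A minimal sketch of requesting a priority at context creation, assuming the extension is advertised in the EGL extension string. To keep the snippet self-contained, the token values are copied here under `SKETCH_` names; real code should include `<EGL/egl.h>` and `<EGL/eglext.h>` and use the official tokens. `build_context_attribs` is a hypothetical helper.

    ```c
    #include <stdint.h>

    /* Token values as published in the EGL headers and the
       EGL_IMG_context_priority extension; duplicated here only so the
       sketch compiles without the EGL headers. */
    #define SKETCH_EGL_NONE                        0x3038
    #define SKETCH_EGL_CONTEXT_CLIENT_VERSION      0x3098
    #define SKETCH_EGL_CONTEXT_PRIORITY_LEVEL_IMG  0x3100
    #define SKETCH_EGL_CONTEXT_PRIORITY_HIGH_IMG   0x3101
    #define SKETCH_EGL_CONTEXT_PRIORITY_MEDIUM_IMG 0x3102
    #define SKETCH_EGL_CONTEXT_PRIORITY_LOW_IMG    0x3103

    /* Fills a five-entry attribute list suitable for passing as the last
       argument of eglCreateContext(display, config, share_context, attribs). */
    void build_context_attribs(int32_t gles_version, int32_t priority,
                               int32_t out[5])
    {
        out[0] = SKETCH_EGL_CONTEXT_CLIENT_VERSION;
        out[1] = gles_version;
        out[2] = SKETCH_EGL_CONTEXT_PRIORITY_LEVEL_IMG;
        out[3] = priority;
        out[4] = SKETCH_EGL_NONE;   /* list terminator */
    }
    ```

    The long-running context would be built with the LOW token and the HUD context with the HIGH token. The priority is a hint, so it may be worth querying the created context for the level it was actually given.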

    What are your target platforms?

    (Sorry I don't have a direct answer about how to cause a shader to terminate early based on time.)

  • Thanks bradgrantham!

    Your answer is very welcome, tremendously useful, and very enlightening. I was unaware of the IMG_context_priority extension (which I assume Mali supports), which seems tremendously useful -- thank you! You are correct that this isn't exactly what I had in mind, but if the implementation is reliable, it can stand in and produce the same results as a full cancel under many circumstances, and with a potentially more straightforward implementation. I am very grateful for the suggestion, and will be thinking about whether this will suit my needs -- I suspect it might!

    I haven't started my graphics project just yet, but I will be targeting Samsung's Gear VR hardware, which implies Mali and potentially Adreno. I'm really excited to get started!

    Sean

  • Please take this post with a hint of skepticism; I am still very much a student of GLES.

    After much research, it does not seem possible to stop a draw command in flight or after it has been submitted for processing. As mentioned above, you can set up an "exit-early" test in your vertex shaders, based on some externally set value, to avoid doing the work and quickly exit. This, unfortunately, has at least two disadvantages. The first is that you must extend your shader by a few cycles to test for the exit-early condition, and to cull the vertex if the exit-early condition succeeds. The second disadvantage is one of magnitude: if you are trying to exit early with a large number of vertices (e.g. 1,000,000), each vertex must still be tested and culled. This becomes more of a problem the larger your cancelable set grows!

    But there might be a more straightforward way. First, a bit of preamble.

    Most of the literature regarding issuing commands to draw to a buffer seems to revolve around eglSwapBuffers as a final command. From the developer's perspective, the invocation of this function marks the moment a buffer is ready to be transferred for display. For the driver, however, this command can often mark the moment that the actual GPU rendering can begin! And this makes sense: the CPU is busy queuing up commands for drawing, and then, all at once, the actual drawing can commence. As all work has been submitted, this allows the driver to organize and selectively execute the workload in an effort to maximize efficiency. In a typical application using multiple buffers, this work can happen completely in the background, leaving the CPU to deal with a future frame. peterharris has a tremendous article on this very process and its benefits: The Mali GPU: An Abstract Machine, Part 1 - Frame Pipelining.

    Of course, there is a really important caveat that may go unnoticed with this form of command pipelining: you must submit all of your work prior to frame rendering. For most rendering scenarios, this is perfectly acceptable, as developers will usually know exactly what is to go into each scene. When they do not, it becomes a problem.

    Enter glFlush. glFlush is a command that forces the current GL command buffer to be processed by the GPU and therefore allows you to submit a workload incrementally and have it processed incrementally. Put another way, the entire scene need not be submitted before the GPU begins processing it.

    But this comes at a cost. On the CPU, glFlush will "block", or wait, until all existing buffered commands have been added to the GPU's list of tasks to be processed, but it will not wait for the GPU to finish this workload (that behaviour is reserved for glFinish). glFlush likely involves a bit of GPU penalty as well. This may be especially true with tile-based renderers, which do setup work to maximize performance and bandwidth efficiency and are likely to be more effective given a full scene as opposed to a partial one. The actual implementation and behaviour of glFlush is likely architecture- and driver-specific.
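
    The incremental pattern described above might look roughly like this. It is a sketch only: `gl_hooks`, `submit_incrementally`, and the tracing helpers are hypothetical names, with the `flush` hook standing in for glFlush and the `draw` hook for whatever draw calls make up one chunk of the scene.

    ```c
    #include <stddef.h>

    /* Hypothetical hooks standing in for real GL calls. */
    struct gl_hooks {
        void (*draw)(void *ctx, size_t chunk);   /* issue one chunk of draws */
        void (*flush)(void *ctx);                /* plays the role of glFlush */
        void *ctx;
    };

    /* Submits the scene chunk by chunk, flushing after each one so the GPU
       may start on it without waiting for the rest of the frame. Nothing
       here waits for completion; that would be the role of glFinish. */
    void submit_incrementally(const struct gl_hooks *gl, size_t chunk_count)
    {
        for (size_t i = 0; i < chunk_count; ++i) {
            gl->draw(gl->ctx, i);
            gl->flush(gl->ctx);
        }
    }

    /* Demonstration hooks that record the call order as 'D' and 'F'. */
    struct trace { char *buf; size_t len; size_t cap; };

    static void trace_put(struct trace *t, char c)
    {
        if (t->len + 1 < t->cap) {
            t->buf[t->len++] = c;
            t->buf[t->len] = '\0';
        }
    }

    static void trace_draw(void *ctx, size_t chunk)
    {
        (void)chunk;
        trace_put(ctx, 'D');
    }

    static void trace_flush(void *ctx) { trace_put(ctx, 'F'); }

    /* Writes the call sequence for chunk_count chunks into buf and returns
       the number of characters written. */
    size_t demo_trace(size_t chunk_count, char *buf, size_t cap)
    {
        struct trace t = { buf, 0, cap };
        if (cap > 0) buf[0] = '\0';
        struct gl_hooks gl = { trace_draw, trace_flush, &t };
        submit_incrementally(&gl, chunk_count);
        return t.len;
    }
    ```

    For three chunks the recorded order is draw, flush, draw, flush, draw, flush, which is the opposite of the usual "queue everything, then swap" frame.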

    Why would you want to use this command? I do not know. But there may be situations that call for this behaviour, and your reasons may be perfectly valid. You may want to submit as many small rendering jobs as will likely fit in a specific time interval (as is the case with the original problem). Alternatively, you may wish to run a small set of benchmarking/test jobs alongside your shipped app to collect information about the architectures of your users.
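
    Deciding how many small jobs "will likely fit" comes down to simple arithmetic once you have some estimate of the cost per job. The sketch below assumes such measurements exist (for instance from GPU timer queries, where the platform offers them); `update_avg_us`, `jobs_that_fit`, and the 1/8 smoothing weight are illustrative choices, not anything prescribed by GLES.

    ```c
    #include <stdint.h>

    /* Exponential moving average of job cost in microseconds. The newest
       sample gets a weight of 1/8; a zero average means "no data yet". */
    uint32_t update_avg_us(uint32_t avg_us, uint32_t sample_us)
    {
        if (avg_us == 0) return sample_us;
        return (uint32_t)(((uint64_t)avg_us * 7u + sample_us) / 8u);
    }

    /* Number of whole jobs expected to fit in what is left of the budget. */
    uint32_t jobs_that_fit(uint32_t budget_us, uint32_t elapsed_us,
                           uint32_t avg_job_us)
    {
        if (avg_job_us == 0 || elapsed_us >= budget_us) return 0;
        return (budget_us - elapsed_us) / avg_job_us;
    }
    ```

    For example, with a 14ms budget, 2ms already spent, and jobs averaging 500 microseconds, `jobs_that_fit(14000, 2000, 500)` gives 24.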

    glFlush is supported as far back as GL ES 1.1, and is supported up to GL ES 3.1.

    Corrections would be most welcome, so that I can edit this post (I would hate to mislead anyone), as are 'likes' for support!

    Sean

  • It has been brought to my attention that while glFlush will work, it incurs a very large bandwidth penalty (if writing to one frame) due to having to repeatedly read/write the FBO to memory. The steep cost may (very likely) offset the benefits of incremental rendering to maximize workloads. Use of glFlush has thus been discouraged, and understandably so given this consequence. Of course, this is not the case if you are writing to different buffers.

    To optimize for memory, the developer may consider rendering to small sections of the framebuffer at a time, exploiting spatial coherence to make the most of the caches. Outside of that, it may be worth looking to the GLES 3.1 compute shader to implement these edge cases, even if that means re-implementing certain aspects of the GL pipeline.
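
    One way to work on small sections of the framebuffer is to walk it as a grid of rectangles and restrict each pass to one of them, for example with glScissor while the scissor test is enabled. The sketch below only computes the rectangles; `tile_count`, `tile_rect`, and the tile size are hypothetical, and a scissor limits fragment work rather than vertex work.

    ```c
    #include <stdint.h>

    struct rect { int32_t x, y, w, h; };

    /* Number of tiles needed to cover a framebuffer with square tiles of
       the given size; edge tiles may be smaller. */
    int32_t tile_count(int32_t fb_w, int32_t fb_h, int32_t tile)
    {
        if (fb_w <= 0 || fb_h <= 0 || tile <= 0) return 0;
        int32_t cols = (fb_w + tile - 1) / tile;
        int32_t rows = (fb_h + tile - 1) / tile;
        return cols * rows;
    }

    /* Computes the rectangle of tile `index` in row-major order, clamped to
       the framebuffer edges. Returns 1 on success, 0 for an invalid index. */
    int tile_rect(int32_t fb_w, int32_t fb_h, int32_t tile, int32_t index,
                  struct rect *out)
    {
        int32_t n = tile_count(fb_w, fb_h, tile);
        if (index < 0 || index >= n) return 0;
        int32_t cols = (fb_w + tile - 1) / tile;
        out->x = (index % cols) * tile;
        out->y = (index / cols) * tile;
        out->w = (fb_w - out->x < tile) ? fb_w - out->x : tile;
        out->h = (fb_h - out->y < tile) ? fb_h - out->y : tile;
        return 1;
    }
    ```

    A 1920x1080 target with 256-pixel tiles needs 40 rectangles, the last of which is a 128x56 sliver in the corner.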

    Sean