This is an interesting question: is it possible to query a timestamp and, in doing so, determine whether vertex/fragment shaders should return prematurely, thus rushing to the end of the draw call once a particular amount of time has elapsed? Put another way, can we use a timestamp to somehow short-circuit a submitted draw call? For example, could we issue millions of instances in a call and 'end' the call (by reading a timestamp in the shader) when we got to, say, 14ms, thus keeping the frame time within a 60fps budget?
Would the workload have to be split into batches and the "time query" have to be done prior to issuing each call?
Sean
Please take this post with a hint of skepticism; I am still very much a student of GLES.
After much research, it does not seem possible to stop a draw command in flight or after it has been submitted for processing. As mentioned above, you can set up an "exit-early" test in your vertex shaders based on some externally set value, so that the shader avoids doing the work and exits quickly. This, unfortunately, has at least two disadvantages. The first is that you must extend your shader by a few cycles to test for the exit-early condition, and to cull the vertex if the test succeeds. The second disadvantage is one of magnitude: if you are trying to exit early with a large number of vertices (e.g. 1,000,000), every one of those vertices must still be tested and culled. This becomes more of a problem the larger your cancellable set grows!
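To make the idea concrete, here is a minimal sketch of what that test could look like, assuming a GLES 3.0 context; the names (u_cancel, a_position, cancel_remaining_work) are hypothetical. Note that changing the uniform only affects draw calls issued afterwards, consistent with being unable to stop work already in flight:

```c
#include <GLES3/gl3.h>

/* Vertex shader with a hypothetical "exit-early" flag: when u_cancel is
 * set, every vertex is pushed outside the clip volume so the primitive
 * is culled before any fragment work happens. */
static const char *vertex_src =
    "#version 300 es\n"
    "uniform int u_cancel;  // 1 = abandon the remaining work\n"
    "in vec4 a_position;\n"
    "void main() {\n"
    "    if (u_cancel == 1) {\n"
    "        gl_Position = vec4(2.0, 2.0, 2.0, 1.0); // outside clip volume\n"
    "        return;  // the few extra cycles mentioned above\n"
    "    }\n"
    "    gl_Position = a_position;\n"
    "}\n";

/* CPU side: flip the flag once the frame budget is exhausted. Subsequent
 * draw calls with this program then degenerate to near no-ops. */
void cancel_remaining_work(GLuint program)
{
    glUseProgram(program);
    glUniform1i(glGetUniformLocation(program, "u_cancel"), 1);
}
```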
But there might be a more straightforward way. First, a bit of preamble.
Most of the literature regarding issuing commands to draw to a buffer seems to revolve around eglSwapBuffers as a final command. From the developer's perspective, the invocation of this function marks the moment a buffer is ready to be transferred for display. For the driver, however, this command can often mark the moment that the actual GPU rendering begins! And this makes sense: the CPU is busy queuing up commands for drawing, and then, all at once, the actual drawing can commence. Because all work has been submitted, the driver can organize and selectively execute the workload for maximum efficiency. In a typical application using multiple buffers, this work can happen completely in the background, leaving the CPU free to deal with a future frame. peterharris has a tremendous article on this very process and its benefits: The Mali GPU: An Abstract Machine, Part 1 - Frame Pipelining.
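In code, the typical frame structure looks something like the sketch below (assuming an already-initialized EGL display and surface); the point is that the draw calls themselves only queue work:

```c
#include <EGL/egl.h>
#include <GLES3/gl3.h>

/* The CPU merely queues commands here; on a deferred, tile-based driver
 * the real GPU rendering for the frame is typically kicked off by
 * eglSwapBuffers(), not by the individual draw calls. */
void render_frame(EGLDisplay display, EGLSurface surface, GLsizei vertex_count)
{
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glDrawArrays(GL_TRIANGLES, 0, vertex_count); /* queued, not executed yet */
    eglSwapBuffers(display, surface);            /* whole frame submitted    */
}
```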
Of course, there is a really important caveat that may go unnoticed with this form of command pipelining: you must submit all of your work prior to frame rendering. For most rendering scenarios this is perfectly acceptable, as developers usually know exactly what goes into each scene. When they do not, it becomes a problem.
Enter glFlush. glFlush tells the driver to hand the current GL command buffer to the GPU for processing, and therefore allows you to submit a workload incrementally and have it processed incrementally. Put another way, the entire scene need not be submitted before the GPU begins processing it.
But this comes at a cost. On the CPU, glFlush will "block", waiting for all existing buffered commands to be added to the GPU's to-be-processed list, but it will not wait for the GPU to finish that workload (that behaviour is reserved for glFinish). glFlush likely involves a bit of a GPU penalty as well. This may be especially true with tile-based renderers, which perform setup work to maximize performance and bandwidth efficiency and are likely to be more effective given a full scene rather than a partial one. The actual implementation and behaviour of glFlush is likely architecture- and driver-specific.
Why would you want to use this command? I do not know. But there may be situations that call for this behaviour, and your reasons may be perfectly valid. You may want to submit as many small rendering jobs as will fit in a specific time interval (as is the case with the original problem). Alternatively, you may want to run a small set of benchmarking/test jobs alongside your shipped app to collect information about the architectures your users run on.
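For the time-budget use case, the approach could look roughly like the sketch below, assuming a POSIX monotonic clock; draw_batch is a hypothetical helper that issues one chunk of the scene, and note that this measures elapsed CPU time, not GPU completion:

```c
#include <GLES3/gl3.h>
#include <time.h>

void draw_batch(int i); /* hypothetical: issues one small chunk of work */

static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1.0e6;
}

/* Split the workload into batches, glFlush() after each so the GPU can
 * start early, and stop issuing new batches once the frame budget
 * (e.g. 14 ms) is spent. */
void submit_until_budget(int batch_count, double budget_ms)
{
    double start = now_ms();
    for (int i = 0; i < batch_count; ++i) {
        if (now_ms() - start > budget_ms)
            break;     /* budget spent: skip the remaining batches */
        draw_batch(i); /* one small chunk of the scene */
        glFlush();     /* hand this chunk to the GPU now */
    }
}
```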
glFlush is supported as far back as GL ES 1.1 and remains supported up through GL ES 3.1.
Corrections would be most welcome, so that I can edit this post (I would hate to mislead anyone), as are 'likes' for support!
It has been brought to my attention that while glFlush will work, it incurs a very large bandwidth penalty (when rendering incrementally to a single framebuffer) because the FBO must repeatedly be written out to and read back from memory: on a tile-based renderer, each flush forces the partial frame out of on-chip tile memory to main memory and back in again for the next pass. This steep cost may (very likely) offset the benefits of incremental rendering to maximize workloads. Use of glFlush has thus been discouraged, and understandably so given this consequence. Of course, this does not apply if you are writing to different buffers.
To optimize for memory, the developer might consider rendering to small sections of the framebuffer at a time, exploiting spatial coherence to make better use of caches. Beyond that, it may be worth looking at GLES 3.1 compute shaders to implement these edge cases, even if that means re-implementing certain aspects of the GL pipeline.
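As a rough illustration of the compute route, the sketch below dispatches a workload in small chunks, assuming a linked GLES 3.1 compute program whose shader declares layout(local_size_x = 64) and takes its chunk offset from a hypothetical uniform at location 0:

```c
#include <GLES3/gl31.h>

/* Dispatch 'total_items' items of work in chunks, retaining full control
 * over how much is submitted at a time; a time check like the one in the
 * glFlush example above could break out of this loop mid-frame. */
void dispatch_in_chunks(GLuint program, GLuint total_items, GLuint chunk)
{
    glUseProgram(program);
    for (GLuint done = 0; done < total_items; done += chunk) {
        glUniform1ui(0, done);                      /* chunk offset (loc 0) */
        glDispatchCompute((chunk + 63) / 64, 1, 1); /* work groups of 64    */
        glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT);
    }
}
```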