NVIDIA is ratifying this extension for OpenGL ES 3.x and is exposing it in their latest video drivers (which are only available on the X1 development platform).
My application takes advantage of the desktop OpenGL variant (GL_ARB_buffer_storage) to dramatically reduce the CPU overhead of calling into the OpenGL API.
It also lets us decode our data directly into GPU buffers while the GPU is rendering from them. We keep multiple frames of data, of course, so that we don't overwrite anything that is still in flight. For UMA systems like those that run Mali GPUs this is a big win for us, especially given how little bandwidth is available on them compared to what we have on desktops.
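A rough sketch of the "multiple frames of data" idea, in case it helps. This is only the offset bookkeeping, not our actual code: `FrameRing`, its members, and the slice layout are hypothetical names made up for illustration. It assumes one persistently mapped buffer split into equal per-frame slices, and that the caller waits on a fence (for example with `glClientWaitSync`) before reusing a slice.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: hands out byte offsets inside one persistently mapped
// buffer that is split into `frames` equal slices of `slice_bytes` each.
class FrameRing {
public:
    static constexpr std::size_t kNoSpace = static_cast<std::size_t>(-1);

    FrameRing(std::size_t slice_bytes, std::size_t frames)
        : slice_bytes_(slice_bytes), frames_(frames) {
        assert(slice_bytes > 0 && frames > 0);
    }

    // Move to the next slice. The caller is assumed to have waited on the
    // fence guarding that slice, so the GPU is no longer reading from it.
    void begin_frame() {
        frame_ = (frame_ + 1) % frames_;
        cursor_ = 0;
    }

    // Reserve `bytes` in the current slice. `alignment` must be a power of
    // two. Returns an offset into the whole buffer, or kNoSpace if the slice
    // is full. The offset is aligned in absolute terms only when slice_bytes
    // is itself a multiple of `alignment`.
    std::size_t allocate(std::size_t bytes, std::size_t alignment) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const std::size_t aligned = (cursor_ + alignment - 1) & ~(alignment - 1);
        if (aligned > slice_bytes_ || bytes > slice_bytes_ - aligned) {
            return kNoSpace;
        }
        cursor_ = aligned + bytes;
        return frame_ * slice_bytes_ + aligned;
    }

private:
    std::size_t slice_bytes_;
    std::size_t frames_;
    std::size_t frame_ = 0;   // index of the slice currently being written
    std::size_t cursor_ = 0;  // next free byte inside the current slice
};
```

The CPU writes through the mapped pointer at the returned offset and draws from that same offset, so nothing in the slice the GPU is still reading gets touched.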
Hopefully you'll think about implementing support for this extension.
I am also going with the assumption that persistent mapping is what the OP is interested in. Though it's a nice feature to have, developers were doing without persistent mapping before its introduction and still do after it. The approaches explained above work fine, as I have used them myself; using a ring buffer or orphaning won't necessarily be slower than a persistent mapping (mileage may vary, and one would have to profile both methods to see). A chapter in OpenGL Insights covers experiments done with a few of the strategies listed above (but for OpenGL, not OpenGL ES), and you would be surprised. Persistent mapping is not a panacea either, as one still has to worry about synchronization, and that also adds overhead. Last but not least, I don't think designing an application around a single extension is good design practice, as you are then limiting the number of devices the application can run on, unless the core of the design is to run on just system X.
Since we are a performance-oriented application, we support several ways of updating buffers and pick between them depending on how efficient each one is on a given platform.
Currently we support six different ways of updating buffers.
The most efficient way for us is if the driver exposes GL_{ARB, OES, EXT}_draw_elements_base_vertex alongside GL_{ARB, EXT}_buffer_storage.
If the driver doesn't expose that path we fall back to other ways of updating our buffers, typically glMapBufferRange, glBufferSubData, or glBufferData in descending order from most efficient to least efficient.
Of course, each of these methods is used under different circumstances. For example, if the driver doesn't expose base_vertex then we can only update our buffers with glBuffer{Sub,}Data.
Then we take it further and check whether we are on a deferred renderer; if we are, we fall back to glBufferData only. The exception is when the driver supports base_vertex, in which case it becomes more efficient to update the buffers with glMapBufferRange with the unsynchronized flag set.
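To make the fallback order above concrete, here is a minimal sketch of the selection logic as described in this post. It is an illustration only: `DriverCaps`, `UploadPath`, and `choose_upload_path` are hypothetical names, it covers four of the paths rather than all six, and how the deferred-renderer flag gets detected is left out entirely.

```cpp
// Hypothetical capability flags, filled in elsewhere from the extension
// list and whatever renderer detection the application uses.
struct DriverCaps {
    bool has_buffer_storage;    // GL_{ARB, EXT}_buffer_storage
    bool has_base_vertex;       // GL_{ARB, OES, EXT}_draw_elements_base_vertex
    bool is_deferred_renderer;  // assumption: detected by some other means
};

enum class UploadPath {
    PersistentMap,   // persistently mapped buffer_storage buffer
    MapBufferRange,  // glMapBufferRange (unsynchronized flag on deferred renderers)
    BufferSubData,   // glBufferSubData
    BufferData       // glBufferData
};

// Pick the most efficient path the driver allows, following the order
// described in the post.
UploadPath choose_upload_path(const DriverCaps& caps) {
    if (caps.has_base_vertex) {
        // base_vertex lets draws read from wherever the data was streamed.
        if (caps.has_buffer_storage) {
            return UploadPath::PersistentMap;
        }
        return UploadPath::MapBufferRange;
    }
    // Without base_vertex only glBuffer{Sub,}Data remain, and deferred
    // renderers are limited to glBufferData.
    return caps.is_deferred_renderer ? UploadPath::BufferData
                                     : UploadPath::BufferSubData;
}
```

For example, a driver that reports neither extension on a deferred renderer ends up on `UploadPath::BufferData`, the slowest path.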
Again, we support all of these methods to make sure buffer updates are as efficient as possible, and yes, persistent mapping gives us quite an advantage: it lowers CPU overhead considerably because we don't need to call into the driver constantly. We do a very large number of buffer updates, which can definitely leave us bound by API overhead.
This definitely shows on the Nexus 9, where the drivers have been restricted to a GLES 3.1 subset without base_vertex and buffer_storage, although that can be hacked around to still use those functions.