NVIDIA is ratifying this extension to work with OpenGL ES 3.x and is exposing it in their latest video drivers (which are only available on the X1 development platform).
My application takes advantage of the desktop OpenGL variant (GL_ARB_buffer_storage) to dramatically reduce the CPU overhead of calling into the OpenGL API.
This also lets us decode our data directly into GPU buffers while the GPU renders from them. We keep multiple frames of data, of course, so that we don't overwrite anything still in flight. For UMA systems like those that run Mali GPUs this is a big win for us, especially given how little bandwidth is available on them compared to what we have on desktops.
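To illustrate the "multiple frames of data" idea, here is a minimal sketch of how one persistently mapped buffer could be split into per-frame regions. It is not our actual code: the class name FrameRing and its methods are made up for this post, and MarkGpuDone() only stands in for whatever tells you the GPU has finished a frame (in a real renderer that would typically be a fence created with glFenceSync and checked with glClientWaitSync).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: hands out a byte offset into a persistently mapped
// buffer for each frame, cycling through frame_count regions so the CPU
// never writes into a region the GPU may still be reading.
class FrameRing {
public:
    FrameRing(std::size_t bytes_per_frame, std::size_t frame_count)
        : bytes_per_frame_(bytes_per_frame), frame_count_(frame_count) {
        assert(bytes_per_frame > 0 && frame_count > 0);
    }

    // True if the region for the next frame is no longer in flight.
    bool CanWrite() const {
        return next_frame_ < completed_frames_ + frame_count_;
    }

    // Returns the byte offset of the region to decode into for this frame.
    std::size_t BeginFrame() {
        assert(CanWrite());  // a real renderer would wait on a fence here
        const std::size_t offset =
            (next_frame_ % frame_count_) * bytes_per_frame_;
        ++next_frame_;
        return offset;
    }

    // Called when the GPU has finished the oldest outstanding frame
    // (assumption: frames complete in submission order).
    void MarkGpuDone() {
        assert(completed_frames_ < next_frame_);
        ++completed_frames_;
    }

private:
    std::size_t bytes_per_frame_;
    std::size_t frame_count_;
    std::size_t next_frame_ = 0;        // index of the next frame to write
    std::size_t completed_frames_ = 0;  // frames the GPU has finished
};
```

With three regions, the CPU can run up to three frames ahead before it has to wait for the GPU. The offset returned by BeginFrame() would be added to the pointer obtained once from the persistent mapping, so no API call is needed per update.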
Hopefully you'll think about implementing support for this extension.
Since we are a performance-oriented application, we support several ways of updating buffers and pick whichever is most efficient on a given platform.
Currently we support six different ways of updating buffers.
The most efficient way for us is when the driver exposes GL_{ARB, OES, EXT}_draw_elements_base_vertex alongside GL_{ARB, EXT}_buffer_storage.
If the driver doesn't expose that path we fall back to other ways of updating our buffers, typically glMapBufferRange, glBufferSubData, or glBufferData, in descending order of efficiency.
Of course, each of these methods is used under different circumstances. For example, if the driver doesn't expose base_vertex then we can only update our buffers with glBuffer{Sub,}Data.
We then take it further and check whether we are on a deferred renderer; if we are, we fall back to only glBufferData. The exception is when the driver supports base_vertex, in which case it becomes more efficient to update the buffers with glMapBufferRange with the unsynchronized flag (GL_MAP_UNSYNCHRONIZED_BIT) set.
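Roughly, the selection described above could be sketched like this. This is only my simplified reading of the rules in this post, not the real code: DriverCaps, UploadPath and ChooseUploadPath are names invented for the example, and it covers only the four methods named here rather than all six.

```cpp
#include <cassert>

// Hypothetical capability flags, filled in from the extension string and
// renderer detection at startup.
struct DriverCaps {
    bool base_vertex;        // GL_{ARB,OES,EXT}_draw_elements_base_vertex
    bool buffer_storage;     // GL_{ARB,EXT}_buffer_storage
    bool deferred_renderer;  // tile-based / deferred GPU
};

// The update methods named in the post, most efficient first.
enum class UploadPath {
    PersistentMap,   // persistently mapped buffer via buffer_storage
    MapBufferRange,  // glMapBufferRange (unsynchronized flag on deferred GPUs)
    BufferSubData,   // glBufferSubData
    BufferData       // glBufferData
};

UploadPath ChooseUploadPath(const DriverCaps& caps) {
    // Best case: both extensions are exposed.
    if (caps.base_vertex && caps.buffer_storage)
        return UploadPath::PersistentMap;

    // With base_vertex we can still stream through a mapped range,
    // including on deferred renderers.
    if (caps.base_vertex)
        return UploadPath::MapBufferRange;

    // Without base_vertex only glBuffer{Sub,}Data remain, and deferred
    // renderers get glBufferData only.
    if (caps.deferred_renderer)
        return UploadPath::BufferData;
    return UploadPath::BufferSubData;
}
```

For example, a driver that exposes base_vertex but not buffer_storage on a deferred renderer ends up on the glMapBufferRange path, which is exactly the case where having the buffer_storage extension would let us move up to persistent mapping.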
Again, we support all of these methods to get the most efficient buffer updates possible. And yes, persistent mapping gives us quite an advantage: it lowers CPU overhead considerably because we don't need to call into the driver constantly. We do a very large number of buffer updates, which can definitely leave us bound by API overhead.
This definitely shows on the Nexus 9, where the drivers have been forced down to a GLES 3.1 subset without base_vertex and buffer_storage, although they can be hacked around to still use those functions.