Hi all
We are currently migrating an embedded application from a Mali 400MP2 Utgard platform to one with a Mali T720 Midgard GPU. The application uses the following (probably fairly common) mali_egl_image* code to achieve zero-copy update of a texture:
EGLImageKHR eglImage = eglCreateImageKHR( display, EGL_NO_CONTEXT, EGL_NATIVE_PIXMAP_KHR, (EGLClientBuffer)(&fbPixMap), NULL );
glEGLImageTargetTexture2DOES(GL_TEXTURE_2D, eglImage);
...
mali_egl_image *mimg = mali_egl_image_lock_ptr( eglImage );
unsigned char *buffer = mali_egl_image_map_buffer( mimg, attribs_rgb );
// update buffer here
mali_egl_image_unmap_buffer( mimg, attribs_rgb );
mali_egl_image_unlock_ptr( eglImage );
These mali_egl_image_* functions do not appear to be available in the mali_midgard driver we received from our chip vendor.
Our application is written in C and, apart from the above, uses standard OpenGL ES 2.0 calls.
What would be the equivalent approach for updating a texture directly (i.e. not using glTexSubImage2D()) with the T720 Midgard driver? Thankfully the above code lives in a single function that is called from many places, so a direct replacement would be fantastic!
Regards
Chris
Thanks Ben, that's really helpful.
Our source images are raster (either RGB565 or RGBA8888) and we are now "uploading" to the GPU image/texture using the familiar:
glTexImage2D( GL_TEXTURE_2D, 0, GL_RGB, 800, 600, 0, GL_RGB, GL_UNSIGNED_SHORT_5_6_5, image )
In the previous Utgard version we copied and converted the RGB565 image straight into the mapped Mali texture as BGRA8888 using an optimised Neon function, which was our only option really. Another part of the application (the UI) updates parts of textures, and now uses glTexSubImage2D where it too previously used the direct Mali texture mapping trick.
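For reference, that UI path is now just the standard partial update, along these lines (the texture name, region and source pointer here are placeholders; in ES 2.0 the source rows have to be tightly packed to the sub-rectangle width):

glBindTexture(GL_TEXTURE_2D, ui_tex);
// Replace a 64x32 region at (100, 50); ui_pixels points at 64*32 RGB565
// texels with tightly packed rows (there is no GL_UNPACK_ROW_LENGTH in ES 2.0).
glTexSubImage2D(GL_TEXTURE_2D, 0, 100, 50, 64, 32, GL_RGB, GL_UNSIGNED_SHORT_5_6_5, ui_pixels);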
From what you have said, I wonder if we are falling foul of some inefficiencies. Does letting the Mali infrastructure copy and convert the image result in better performance (textures stored as "cache optimal", perhaps)?
We want the GPU workload to be efficient, but at the same time require the source image to be copied for the render thread as fast as possible. I assume there will need to be a balance.
Thanks!
Yes, I guess you've got a decision on which is the most important - the fast image copy, or fast image access thereafter. Keeping it linear will be very fast to copy (and it is what your Utgard version did, if you want consistency), but you get potentially much faster access if you create/convert the image as "cache optimal".
A colleague has pointed out that using glSync between your 2 threads reading from and writing to the image will be better than a full glFinish.
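For what it's worth, a rough sketch of how that hand-off could look. This assumes an ES 3.0 context on the T720 (on a plain ES 2.0 context the EGL_KHR_fence_sync extension gives you the equivalent), that the two threads use contexts in the same share group, and the split of work between the threads is purely illustrative:

// Render thread: after issuing the draws that sample the shared texture,
// insert a fence so the writer knows when the GPU has finished with it.
GLsync frame_done = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
glFlush();   // make sure the fence actually reaches the GPU

// Writer thread: wait on that fence instead of a full glFinish() before
// overwriting the image. This blocks only until those commands complete.
GLenum res = glClientWaitSync(frame_done, GL_SYNC_FLUSH_COMMANDS_BIT,
                              16000000 /* 16 ms, in nanoseconds */);
if (res == GL_TIMEOUT_EXPIRED) {
    // didn't complete in time - skip this update or wait again
}
glDeleteSync(frame_done);
// ... now safe to write the new image contents ...

The point is that the client wait only blocks until that particular fence signals, rather than draining the whole pipeline the way glFinish does.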
Fantastic. We'll look into glSync.
So that we can test and benchmark each image variant, and know what we're testing, could you confirm my understanding of the following?
If, as in our previous Utgard version, we create the image with eglCreateImageKHR() with EGL_NATIVE_PIXMAP_KHR and glEGLImageTargetTexture2DOES(), we will get a linear image?
If, as we're doing right now in the Midgard version, we create the image using glTexImage2D, we will get a "cache optimised" image?
If I've got that right, then what happens if we issue a glTexImage2D to replace the contents of an eglCreateImageKHR-created image? Does it discard all of the internal image attributes (such as the fact that it is linear) and create a new "cache optimised" image, or will it merely reallocate the storage, keeping the attributes and "linear" layout?
Thanks for taking the time to help us out with this - it is really important for us to understand, besides being very interesting.
Hi Chris,
I've clarified with the driver team, and glTexImage2D counts as a create rather than an import, so yes, it will change to "cache optimal" every time. glTexSubImage2D will not change the tiling, so if you want to keep linear you would need to use that.
As to how to get linear - if the allocator is the DDK it will be cache optimal. If the allocator is external and the image is imported it will be whatever the external allocator uses. For example, Android Gralloc will always allocate images using linear tiling whenever the image is host visible.
If you do it the old way with pixmaps it will end up linear as I understand it, yes.
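So if staying linear is what you want, the rough shape would be to import the externally allocated buffer once and then only ever refresh it with glTexSubImage2D. Just a sketch, reusing the fbPixMap pixmap from your original snippet (tex, width, height and new_pixels are placeholder names):

// Import the externally allocated pixmap once; the layout (linear, in this
// case) is whatever the external allocator chose.
EGLImageKHR img = eglCreateImageKHR( display, EGL_NO_CONTEXT, EGL_NATIVE_PIXMAP_KHR, (EGLClientBuffer)(&fbPixMap), NULL );
glBindTexture(GL_TEXTURE_2D, tex);
glEGLImageTargetTexture2DOES(GL_TEXTURE_2D, img);

// Later updates: glTexSubImage2D leaves the existing tiling alone...
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height, GL_RGB, GL_UNSIGNED_SHORT_5_6_5, new_pixels);

// ...whereas a fresh glTexImage2D counts as a create and would switch the
// texture back to the driver's "cache optimal" layout.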
Cheers, Ben
Ben, that's all been tremendously helpful. Thank you.