MALI-400 : eglCreateImageKHR, EGL_GL_TEXTURE_2D_KHR and updating textures with the CPU

Hello everybody, I'm currently struggling with the system named in the title. Is there a *full* code sample for Linux somewhere that creates an EGLImage for a texture and demonstrates how to update it with the CPU?

The reference documentation seems OK, but eglGetError keeps telling me I don't know what I'm doing.

I won't post my various attempts here because they don't work and therefore have no value for the reader, but I've been roaming the web and trying things for a while.

Cheers, Tramb

Parents
  • Hello Wasim and thanks for the help,

    I've already seen the thread you linked, but I don't have any header declaring mali_egl_image_lock_ptr in my toolchain. I can see these functions with IDA in my libMali.so, but I have (of course) no idea about their prototypes.

    Can you confirm that I have to use this mali_egl interface (through dynamic linking and pointer casting, maybe)? If so, I'd need the signatures.

    I found other code on the Internet and tried eglLockSurfaceKHR, as well as eglQuerySurface with EGL_BITMAP_POINTER_KHR and EGL_BITMAP_PITCH_KHR, but without success.

    I'm quite confused about the direction to take.

    To sum up the bigger picture, I'm updating a texture from CPU every frame (no choice there) and I'm trying to avoid the costly texture swizzling in glTexSubImage2D.

    I didn't find a way to specify a linear texture, which would alleviate the cost.

    I didn't find a way to upload a pre-swizzled texture (which I could prepare earlier, NEONized and multithreaded, if I knew or reverse-engineered the swizzling pattern).

    And last, hence my questions, I didn't find a way to simulate PBO operation so that I can work in place and do the fence synchronization myself, which would (I guess) be the best option for excellent Mali performance.

    (I'm quite used to low-level programming, so OpenGL, and OpenGL ES 2 even more so, is always a struggle against the cost of higher-level abstractions.)

    Cheers,

    Bertrand

Children
  • I'm working on an R16 board, Cortex-A7 with Mali-400, without X11 support (fbdev instead).

    No libGLES_mali.so here

    # strings libMali.so | grep 'r[0-9]p[0-9]-'

    1.4 Linux-r4p0-00rel0

                Mali online shader compiler r4p0-00rel0 [Revision 96995].

    I know the swizzle is costly (and CPU-bound) from my profiling with DS-5 Streamline.

    The offenders are _mali_convert_tex8_l_to_tex8_b if I do my palette lookup on the GPU and _mali_convert_tex32_l_to_tex32_b in the other case.

    I expect this to be the swizzling code (l_to_b => linear to block?) but it might be a wrong assumption.

    I'll just be rendering a full-screen quad every frame (maybe scaled with bilinear filtering), so I'm not sure that the swizzling is worth it (and if it is, I could probably do it with a lower latency than the implementation in libMali.so by using several cores and NEON code).

    We're just shipping on one definite platform so I'm definitely willing to specialize stuff to hit 60Hz and hardcode the swizzling pattern.
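To make the "linear to block" idea concrete, here is a minimal sketch of what such a conversion could look like. The layout is purely an assumption for illustration (16x16 pixel blocks with Z-order/Morton addressing inside each block); it is NOT the layout used by libMali.so, which is not documented in this thread. The names morton4 and linear_to_block32 are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Interleave the low 4 bits of x and y into an 8-bit Z-order index
 * (x bits on even positions, y bits on odd positions). */
static unsigned morton4(unsigned x, unsigned y)
{
    unsigned i, r = 0;
    for (i = 0; i < 4; i++) {
        r |= ((x >> i) & 1u) << (2 * i);
        r |= ((y >> i) & 1u) << (2 * i + 1);
    }
    return r;
}

/* Copy a linear 32-bit image into 16x16 blocks, Z-ordered inside each
 * block. width and height must be multiples of 16.
 * ASSUMED layout for illustration only, not the real Mali format. */
static void linear_to_block32(const uint32_t *src, uint32_t *dst,
                              unsigned width, unsigned height)
{
    unsigned blocks_per_row = width / 16;
    unsigned x, y;

    for (y = 0; y < height; y++) {
        for (x = 0; x < width; x++) {
            size_t block = (size_t)(y / 16) * blocks_per_row + (x / 16);
            size_t offset = block * 256 + morton4(x % 16, y % 16);
            dst[offset] = src[(size_t)y * width + x];
        }
    }
}
```

Each output block depends only on its own 16x16 source pixels, so a loop of this shape could be split across cores by block rows, which is the kind of parallel version mentioned above.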

    I thought PBO was not an option because we have only OpenGL ES 2.0, which doesn't support it.

    But the main goal would be zero-copy: generating the texture in place with the CPU, with double-buffering and fences to synchronize all this.

    When I first tried to address the problem I noticed that the following extensions are provided:

    EGL_KHR_image

    EGL_KHR_image_base

    EGL_KHR_gl_texture_2D_image

    EGL_KHR_reusable_sync

    EGL_KHR_fence_sync

    EGL_KHR_lock_surface

    EGL_KHR_lock_surface2

    and it seemed to me they were there to address exactly my problem, and in a fairly portable way at that, without using the mali_ namespace.
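As a side note on reading such a list programmatically, here is a minimal sketch of a whole-token lookup in a space-separated extension string. It assumes the string would come from eglQueryString(display, EGL_EXTENSIONS); the helper name has_extension is hypothetical, and the code below deliberately does not depend on the EGL headers.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true if `name` appears as a whole space-separated token in
 * `ext_list`. A plain strstr() is not enough, because
 * "EGL_KHR_lock_surface" is a prefix of "EGL_KHR_lock_surface2". */
static bool has_extension(const char *ext_list, const char *name)
{
    size_t len;
    const char *p;

    if (ext_list == NULL || name == NULL || *name == '\0')
        return false;

    len = strlen(name);
    p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        bool starts_token = (p == ext_list) || (p[-1] == ' ');
        bool ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return true;
        p++; /* keep searching after a partial match */
    }
    return false;
}
```

With the list above, a lookup for EGL_KHR_fence_sync would succeed, while a lookup for EGL_EXT_image_dma_buf_import would fail.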

    I just can't for the life of me connect the dots, and that's why I'm asking for help.

    (I have looked at the dma stuff but EXT_image_dma_buf_import is not available for me)

    Thanks for reading all this!

    Bertrand

  • If dma_buf is not supported in your BSP, the only other way is to use UMP.

    I will have to double-check whether the magic code I have will work for that version of the driver, but before I do that, can you tell me if your driver is built with UMP?

  • I'm not sure, I'll ask the system team but their first gut feeling is "I don't think so".

  • I still don't have the answer. We contacted our supplier, i.e. your partner, to get some answers.

    When you talk about dma_buf vs UMP, are you thinking about the way to update device memory?

    Is there a way to query the driver to know how glTexImage2D does it? I was thinking the upload was done through DMA, after the swizzling is done on the CPU, but EXT_image_dma_buf_import is not exposed.

  • Hi tramboi,

    Going the dynamic linking route is not a good solution, but if you have access to libMali.so, could you provide the output of one of the following?

    strings libGLES_mali.so | grep 'r[0-9]p[0-9]-'

    or

    strings libMali.so | grep 'r[0-9]p[0-9]-'

    Which device are you using and where did you get the BSP from?

    Pre-swizzled textures are too platform-specific, so I would avoid that. Also, the swizzle patterns are not open.

    Why do you think texture swizzle is costly? Remember, even if you have a 1:1 mapped texture onto the screen, you will still benefit from swizzle if you are using the texture multiple times in the frame. It's very useful if you are drawing the texture in an arbitrary orientation.

    If you have nothing else to do while the upload happens, then I understand; otherwise, why are PBOs not an option?

    Here are a few things you could try in the meantime. If your BSP has X11 support, you could do a zero-copy upload of image data to OpenGL ES via pixmaps. You will need to create an EGLImage from the pixmap and then a texture from the EGLImage. Your CPU code will have to find a way to fill the pixmap (not sure if this is possible).

    Another thing you could try is to use a Linux dma_buf file descriptor to create an EGLImage and then a texture. More details: https://www.khronos.org/registry/egl/extensions/EXT/EGL_EXT_image_dma_buf_import.txt. AFAIK this is also a zero-copy operation.

    HTH,

    Wasim

  • The UMP driver is usually built as a kernel module, so if you check whether it's loaded in your kernel, you will get an answer. I am actually trying to see whether the technique mentioned in "How to share texture memory between CPU/GPU for firefly's/rk3288 fbdev Mali-T764" will work with UMP or not.
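One way to do that check from code is sketched below: scan /proc/modules for a module entry. This is only a sketch under assumptions: the module name "ump" is assumed (it may differ in a given BSP), the helper names are hypothetical, and a driver compiled directly into the kernel would not be listed in /proc/modules at all.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if a /proc/modules-style line starts with `module` followed by
 * a space, so that "ump" does not match "umpx". */
static bool line_is_module(const char *line, const char *module)
{
    size_t len = strlen(module);
    return strncmp(line, module, len) == 0 && line[len] == ' ';
}

/* Scans `path` (normally "/proc/modules") for `module`.
 * Returns 1 if found, 0 if not found, -1 if the file cannot be opened. */
static int module_loaded(const char *path, const char *module)
{
    char line[512];
    int found = 0;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (line_is_module(line, module)) {
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}
```

A call such as module_loaded("/proc/modules", "ump") gives a quick first answer; a result of 0 is not conclusive if the driver was built into the kernel image.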

    tramboi wrote:

    Is there a way to query the driver to know how glTexImage2D does it?

    I don't think so.