
Zero-Copy Buffer Allocation on Arm Mali Midgard GPUs with OpenCL 1.2

I am trying to allocate a zero-copy buffer on Mali Midgard GPUs. The OpenCL 1.2 guide says that the only reliable way to do this is to use the flag

CL_MEM_ALLOC_HOST_PTR

So, first we need to allocate the GPU memory using this flag and then perform map and unmap operations. The issue is that I wish to do it the other way around, i.e. first allocate CPU memory (C arrays etc.) and then create the cl buffer on top of that existing pointer. On Intel integrated graphics, I do this by first allocating memory with the proper alignment (alignment = 4096 bytes, size = a multiple of 64 bytes). Then I use the flag CL_MEM_USE_HOST_PTR together with map/unmap operations to get zero-copy behaviour. How can I do the same on Arm Mali GPUs, so that the CPU and GPU can share buffers for hybrid operations? Kindly help. Also, what is the alignment requirement for Arm GPUs?

  • Hi,

    You can use our vendor memory import extension: https://www.khronos.org/registry/OpenCL/extensions/arm/cl_arm_import_memory.txt

    The pointer will have to be aligned to 64 bytes on platforms that don't support system coherency. Ideally the size should be a multiple of 64 bytes as well.

    Using CL_MEM_USE_HOST_PTR will currently result in a copy in most situations. We are looking at using the above memory import mechanism under the hood to avoid the copy where possible.

    Hope this helps, let me know if you have other questions.

    Regards,

    Kévin
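The two constraints stated in this reply (pointer aligned to 64 bytes, size ideally a multiple of 64 bytes) can be checked before handing a host allocation to the import extension. The following is only a minimal sketch: `is_import_aligned` and `has_ideal_import_size` are hypothetical helper names, not part of OpenCL or of cl_arm_import_memory, and the actual import call is deliberately left out.

```cpp
#include <cstddef>
#include <cstdint>

// Granularity quoted in the reply above for platforms without
// system coherency.
constexpr std::size_t kImportAlignment = 64;

// Hypothetical helper: true when the pointer is non-null and aligned
// to 64 bytes, which the reply describes as a requirement.
inline bool is_import_aligned(const void* ptr) {
    const auto addr = reinterpret_cast<std::uintptr_t>(ptr);
    return ptr != nullptr && addr % kImportAlignment == 0;
}

// Hypothetical helper: true when the size is a non-zero multiple of
// 64 bytes, which the reply describes as ideal rather than mandatory.
inline bool has_ideal_import_size(std::size_t size_in_bytes) {
    return size_in_bytes != 0 && size_in_bytes % kImportAlignment == 0;
}
```

A caller would typically run both checks on the host buffer first and fall back to a CL_MEM_ALLOC_HOST_PTR buffer if the pointer check fails.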

  • Thanks for the quick reply. Actually, I am working on top of someone else's codebase. The CPU code was written using std::vector. To keep using std::vector, I wrote a custom allocator that allocates memory with the proper alignment (4096-byte alignment, size a multiple of 64 bytes) and then created the cl buffer on top of it using CL_MEM_USE_HOST_PTR for Intel integrated graphics. This way I achieved zero-copy behaviour on that platform. I wish to do a similar thing on Arm, i.e. keep using std::vector. Will cl_arm_import_memory work with a pointer to a std::vector's data? Or is there any other alternative? Kindly help.

  • Yes, importing a std::vector's data pointer is supported. You can even reuse the same custom allocator (or relax the alignment constraints if you want).

    Regards,

    Kévin
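For readers who do not already have such an allocator, here is a minimal sketch of one, assuming C++17 (for aligned `operator new`). `AlignedAllocator` is a hypothetical name, not the poster's actual code; the default of 64 bytes follows the figure given earlier in the thread, and 4096 can be passed instead to match the Intel setup.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <new>
#include <vector>

// Hypothetical allocator: returns storage aligned to Alignment bytes and
// pads every allocation up to a multiple of Alignment bytes.
template <typename T, std::size_t Alignment = 64>
struct AlignedAllocator {
    using value_type = T;

    static_assert((Alignment & (Alignment - 1)) == 0,
                  "Alignment must be a power of two");
    static_assert(Alignment >= alignof(T),
                  "Alignment must not be weaker than alignof(T)");

    // Needed because of the non-type template parameter.
    template <typename U>
    struct rebind { using other = AlignedAllocator<U, Alignment>; };

    AlignedAllocator() noexcept = default;
    template <typename U>
    AlignedAllocator(const AlignedAllocator<U, Alignment>&) noexcept {}

    T* allocate(std::size_t n) {
        constexpr std::size_t max_bytes =
            std::numeric_limits<std::size_t>::max() - (Alignment - 1);
        if (n > max_bytes / sizeof(T)) throw std::bad_alloc();
        // Round the byte count up to the next multiple of Alignment.
        const std::size_t bytes =
            (n * sizeof(T) + Alignment - 1) / Alignment * Alignment;
        return static_cast<T*>(
            ::operator new(bytes, std::align_val_t(Alignment)));
    }

    void deallocate(T* p, std::size_t) noexcept {
        ::operator delete(p, std::align_val_t(Alignment));
    }
};

template <typename T, typename U, std::size_t A>
bool operator==(const AlignedAllocator<T, A>&,
                const AlignedAllocator<U, A>&) noexcept { return true; }

template <typename T, typename U, std::size_t A>
bool operator!=(const AlignedAllocator<T, A>&,
                const AlignedAllocator<U, A>&) noexcept { return false; }
```

With this in place, `std::vector<float, AlignedAllocator<float>> v(1000);` yields a `v.data()` pointer aligned to 64 bytes. Note that any operation that reallocates the vector (such as `push_back` or `resize` beyond its capacity) changes `data()`, so a buffer created from the old pointer must not be used afterwards.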

  • Hi, while going through the txt file, I came across the line below, and I don't understand what it means. If I have the OpenCL library and use the C headers from Khronos, won't it work? Or am I missing something? Kindly help.

          If the extension string cl_arm_import_memory_host is exposed then importing
          from normal userspace allocations (such as those created via malloc) is
          supported.
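The "extension string" in the quoted passage is the space-separated list of extension names that a device reports at run time, which an application reads with `clGetDeviceInfo` and `CL_DEVICE_EXTENSIONS`. A minimal sketch of testing such a list for one name is shown below; `has_extension` is a hypothetical helper, and the string is assumed to have been fetched from the driver already. It compares whole tokens, because `cl_arm_import_memory` is a prefix of `cl_arm_import_memory_host` and a plain substring search would confuse the two.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns true when `name` appears as a complete,
// whitespace-delimited token in the device's extension list.
inline bool has_extension(const std::string& extensions,
                          const std::string& name) {
    std::istringstream tokens(extensions);
    std::string token;
    while (tokens >> token) {
        if (token == name) return true;
    }
    return false;
}
```

For example, `has_extension(ext, "cl_arm_import_memory_host")` would be evaluated once at start-up, with `ext` holding the string returned by the driver, before attempting to import memory obtained from `malloc` or a `std::vector`.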