Contiguous memory map to Malis

We have two Malis on our board, an Odroid XU4.  We want to create a large image, with one Mali producing one half of the image and the other Mali producing the other half.  We also want the image to be memory mapped, as it is quite large.  Can we map the image so that each Mali sees at least the part it is working on while the memory for the whole image stays contiguous for the CPU?

A rephrasing of the question might be:  Can different devices share the same memory with the host?

One possible approach, which I would like to confirm before moving forward, uses buffers instead of images:

1. Create the entire buffer in the context with CL_MEM_ALLOC_HOST_PTR.

2. Create two disjoint sub-buffers. (I did not see a way to create sub-images.)

3. Map each sub-buffer on its own command queue.

However, this depends on when the memory gets allocated on the host:

Is the memory allocated in step 1 or in step 3?  If in step 1, the memory will be contiguous on the host.  If in step 3, ... ?  I suspect step 3, since that is when the host pointer becomes available.
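
For step 2, a minimal sketch of how the two disjoint regions could be computed before handing them to clCreateSubBuffer with CL_BUFFER_CREATE_TYPE_REGION. The Region struct and both helper names are hypothetical; Region only mirrors the origin/size fields of cl_buffer_region. The sketch assumes align_bytes is the device's CL_DEVICE_MEM_BASE_ADDR_ALIGN value (which is reported in bits) divided by 8, since a sub-buffer origin that is not aligned to it is rejected with CL_MISALIGNED_SUB_BUFFER_OFFSET. It makes no OpenCL calls itself.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for cl_buffer_region: byte offset and byte size.
struct Region {
    std::size_t origin;
    std::size_t size;
};

// Round value up to the next multiple of align (align must be non-zero).
std::size_t round_up(std::size_t value, std::size_t align) {
    assert(align > 0);
    return (value + align - 1) / align * align;
}

// Split a buffer of total_bytes into two disjoint regions that together
// cover the whole buffer. The split point is the midpoint rounded up to
// align_bytes, so the second region starts on an aligned offset.
// Returns false if no valid split exists (buffer too small or align of 0).
bool split_in_two(std::size_t total_bytes, std::size_t align_bytes,
                  Region& first, Region& second) {
    if (align_bytes == 0) return false;
    const std::size_t mid = round_up(total_bytes / 2, align_bytes);
    if (mid == 0 || mid >= total_bytes) return false;
    first = {0, mid};
    second = {mid, total_bytes - mid};
    return true;
}
```

Because of the rounding, the two halves may differ slightly in size, so each kernel would need to be told the actual size of its region rather than assuming exactly half.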

Message was edited by: Norman Goldstein

  • Thanks for the info and pointers.  Here is an outline of what I understand from this:

    -- A single context having the two devices: device0 and device1
       and two queues: queue0 and queue1

    // The float single channel image that we want to generate
    -- image = clCreateImage2D( context,
                                CL_MEM_WRITE_ONLY | CL_MEM_ALLOC_HOST_PTR,
                                ...,
                                nullptr, // host ptr
                                ... );

    // Map the entire image
    -- float* ptr = clEnqueueMapImage( queue0,
                                       image,
                                       ... );

    After running the kernels, ptr will point to the entire (contiguous) image, as created by the kernels of the two devices. We could have used "queue1" instead of "queue0" to do the mapping -- it makes no difference, due to the Mali memory architecture.
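
To go with the outline above, a small sketch of two host-side details it leaves open: how the image rows might be divided between the two queues, and how to address rows through the mapped pointer. All names here (RowRange, split_rows, mapped_row, rows_are_packed) are hypothetical helpers, not part of OpenCL. The sketch assumes a single-channel float image and that row_pitch_bytes is the value clEnqueueMapImage returns through its image_row_pitch output parameter; the row ranges are meant to be used as the global work offset and size in the row dimension when enqueuing each device's kernel.

```cpp
#include <cstddef>

// Rows assigned to one device: starting row and number of rows.
struct RowRange {
    std::size_t first_row;
    std::size_t row_count;
};

// Divide `height` rows between two devices. Device 0 gets the top half
// (rounded down), device 1 gets the remaining rows.
inline void split_rows(std::size_t height, RowRange& dev0, RowRange& dev1) {
    const std::size_t half = height / 2;
    dev0 = {0, half};
    dev1 = {half, height - half};
}

// Pointer to the first pixel of row y in a mapped single-channel float image.
// The row pitch is in bytes, so the offset is applied to a byte pointer.
inline float* mapped_row(void* mapped_ptr, std::size_t row_pitch_bytes,
                         std::size_t y) {
    unsigned char* base = static_cast<unsigned char*>(mapped_ptr);
    return reinterpret_cast<float*>(base + y * row_pitch_bytes);
}

// True when the mapped rows have no padding, i.e. the whole image can be
// treated as one packed array of width * height floats.
inline bool rows_are_packed(std::size_t width, std::size_t row_pitch_bytes) {
    return row_pitch_bytes == width * sizeof(float);
}
```

If rows_are_packed returns false, the mapped memory is still one allocation, but each row has to be reached through mapped_row rather than by indexing ptr as a flat width * height array.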