Mali T628 user space driver performance

Hi,

I have an OpenCL program that runs on a Mali T628 platform. Originally I was using kernel driver r5p1-00rel1 and user driver r5p0-06rel0, but I kept hitting what looked like a deadlock in the closed-source user driver, and my program would hang from time to time. Backtraces from two of the threads during a hang:

#0  0xb5394b24 in __lll_lock_wait () from /lib/libpthread.so.0
#1  0xb538ec48 in pthread_mutex_lock () from /lib/libpthread.so.0
#2  0xb5e8fe2c in cmem_tmem_heap_alloc () from /usr/lib/libOpenCL.so
#3  0xb5e9e356 in build_single_job () from /usr/lib/libOpenCL.so
#4  0x00000000 in ?? ()

#0  0xb5e8b9b0 in add_to_bin () from /usr/lib/libOpenCL.so
#1  0xb5e8bad4 in alloc_block () from /usr/lib/libOpenCL.so
#2  0xb5e8bd48 in cmemp_heap_alloc () from /usr/lib/libOpenCL.so
#3  0xb5e8fe40 in cmem_tmem_heap_alloc () from /usr/lib/libOpenCL.so
#4  0xb5e9e356 in build_single_job () from /usr/lib/libOpenCL.so
#5  0x00000000 in ?? ()

Since I can't see the source and therefore can't debug it, I tried switching to newer versions of the driver (kernel r6p0-02rel0 and user r6p0-02rel0). This apparently solved the deadlock, but my program now takes a performance hit of about 15%. The environment is otherwise exactly the same; the only difference is the kernel/user driver.

The program uses the two Mali devices simultaneously in a single device context.

Is anyone else hitting the same problem, or can anyone suggest a potential solution? Thanks.

  • Hi,

    The deadlock is most likely caused by the mismatch between your kernel (r5p1) and userspace (r5p0) driver versions.

    Without your source code and more details about your platform I can't really help more than that.

    There was a change in the CPU cache maintenance routines between r5p0 and r6p0, which might have caused the difference in performance. In more recent versions of the driver, the performance should be back to what it used to be.

    Hope this helps,

    Regards,

    Anthony
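
For anyone wanting to catch this kind of kernel/userspace mismatch before it turns into a hang, here is a minimal sketch of a version check. It assumes the release strings follow the rNpM-XXrelY pattern of the ones quoted in this thread, and that any difference in the r or p number counts as a mismatch. That rule is an assumption made for illustration, not ARM's documented compatibility policy. The helper names `parse_release` and `same_release` are hypothetical, and the strings have to be supplied by hand.

```python
import re

# Pattern inferred from the strings quoted in this thread, e.g. "r5p1-00rel1".
# This is an assumption about the format, not an official specification.
_RELEASE = re.compile(r"^r(\d+)p(\d+)-(\d+)rel(\d+)$")


def parse_release(text):
    """Split a release string into the integers (r, p, build, rel)."""
    match = _RELEASE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised release string: {text!r}")
    return tuple(int(group) for group in match.groups())


def same_release(kernel, user):
    """Return True when the kernel and user drivers share the same r and p numbers.

    Assumption: only the rNpM part has to agree; the build and rel
    suffixes are ignored.
    """
    return parse_release(kernel)[:2] == parse_release(user)[:2]


# The two driver combinations mentioned in the original post.
pairs = [
    ("r5p1-00rel1", "r5p0-06rel0"),  # original setup that hung
    ("r6p0-02rel0", "r6p0-02rel0"),  # updated setup
]

for kernel, user in pairs:
    status = "match" if same_release(kernel, user) else "MISMATCH"
    print(f"kernel {kernel} / user {user}: {status}")
```

With the versions from the original post, the first pair is flagged as a mismatch (r5p1 against r5p0) and the second pair matches.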
