Hello All,
I am developing OpenCL code for several devices. At the moment I am working with a Rockchip RK3588 (OpenCL device: Mali-G610 r0p0). The algorithm was originally written in CUDA, where the warp size is 32. In OpenCL the corresponding value is the sub-group size (the number of work-items that execute together). This value can also be obtained from CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE.

On an Intel GPU, for example, I can set this value with "__attribute__((intel_reqd_sub_group_size(32)))". On the Mali-G610 r0p0, however, I get CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE = 16, and the program does not work correctly with it. I need to change this value to 32. Can someone help me with this?
clinfo returned the following:
................
Preferred work group size multiple (device) 16
Preferred work group size multiple (kernel) 16
Max sub-groups per work group 64
................
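To make the mismatch concrete, here is a minimal sketch in plain C (no OpenCL calls) of the arithmetic involved. It assumes that the sub-group size on this device equals the preferred work-group size multiple of 16 reported above, which I have not confirmed. The helper names `subgroups_per_warp` and `round_up_to_multiple` are only illustrative and are not part of any OpenCL API.

```c
/* Assumption: the hardware sub-group size equals the preferred
 * work-group size multiple reported by clinfo (16 on the Mali-G610).
 * Both helpers are hypothetical and do not call into OpenCL. */

/* Number of hardware sub-groups needed to cover one logical "warp"
 * of warp_size work-items (ceiling division). */
static unsigned subgroups_per_warp(unsigned warp_size, unsigned subgroup_size)
{
    return (warp_size + subgroup_size - 1) / subgroup_size;
}

/* Round a requested work-group size up to the next multiple of
 * `multiple`, e.g. the preferred work-group size multiple. */
static unsigned round_up_to_multiple(unsigned size, unsigned multiple)
{
    return ((size + multiple - 1) / multiple) * multiple;
}
```

With the values from this post, `subgroups_per_warp(32, 16)` gives 2, so under the stated assumption one 32-wide CUDA warp would span two sub-groups on the Mali device. `round_up_to_multiple(100, 16)` gives 112.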