Hello. Since Mali lacks local memory, I am trying to use subgroups the way Intel does in its clDNN library. Intel GPUs do have local memory, but exchanging data through registers is even faster than going through local memory. I have three questions about subgroups in the Bifrost and Valhall implementations of OpenCL:
Awesome! Thank you so much!