Hi Guys,
I am actively optimizing my OpenCL program running on a Mali Bifrost GPU.
I wonder whether cache prefetching would improve my program's performance.
I couldn't find any material on Mali GPU prefetching; however, I found that the OpenCL standard has a built-in function, prefetch().
When I tried it, the prefetch() function had no effect on my program's performance. Thus, I think maybe the prefetch function
is just an empty implementation on Mali GPUs.
Does anyone know:
1. Is there any prefetch mechanism on Mali Bifrost GPUs, given that OpenCL provides the built-in prefetch() function?
2. If prefetching does benefit performance, how can I use it properly?
Thank you so much in advance for answering my questions!
zengzeng.sun said: Thus, I think maybe the prefetch function is just an empty implementation on Mali GPUs
You guessed correctly: there is no prefetch on Mali GPUs, so the prefetch hints are ignored.
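For anyone who finds this thread later, here is a minimal sketch of how the OpenCL C built-in prefetch(const __global gentype *p, size_t num_gentypes) is typically called. It is an illustration only, not code from the original program: the kernel name scale_buffer, the emu_* variables, and the run_scale helper are all made up for this example, and the #ifndef block is just a host-side stand-in so the same file also builds as plain C. Because the call is only a hint, the output is identical with or without it, which is consistent with a driver that simply ignores it.

```c
/* Host-side shims (assumptions for illustration only). A real OpenCL
   compiler defines __OPENCL_VERSION__ and provides all of these itself. */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#define __kernel
#define __global
static size_t emu_gid;    /* hypothetical: id of the emulated work-item */
static size_t emu_gsize;  /* hypothetical: emulated global work size */
static size_t get_global_id(unsigned int dim)   { (void)dim; return emu_gid; }
static size_t get_global_size(unsigned int dim) { (void)dim; return emu_gsize; }
/* No-op stand-in, mirroring a driver that ignores the hint. */
static void prefetch(const void *p, size_t num_elements)
{
    (void)p;
    (void)num_elements;
}
#endif

/* Each work-item walks the buffer with a stride equal to the global size.
   Before processing element i, it hints that the element for its next
   iteration will be needed. The bounds check keeps the hinted address
   inside the buffer. */
__kernel void scale_buffer(__global const float *in, __global float *out,
                           float factor, unsigned int n)
{
    size_t stride = get_global_size(0);
    for (size_t i = get_global_id(0); i < n; i += stride) {
        if (i + stride < n)
            prefetch(&in[i + stride], 1);  /* hint: one float at in[i + stride] */
        out[i] = in[i] * factor;
    }
}

#ifndef __OPENCL_VERSION__
/* Hypothetical host-side driver: runs the kernel body once per emulated
   work-item so the logic can be checked on a CPU without an OpenCL runtime. */
void run_scale(const float *in, float *out, float factor,
               unsigned int n, size_t global_size)
{
    emu_gsize = global_size;
    for (emu_gid = 0; emu_gid < global_size; ++emu_gid)
        scale_buffer(in, out, factor, n);
}
#endif
```

Since the hint is ignored on Mali, timing the kernel with and without the prefetch() line should show no difference, which matches what was observed above.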
Kind regards,
Pete