Hi Guys,
I am actively optimizing an OpenCL program running on a Mali Bifrost GPU.
I wonder whether cache prefetching would improve my program's performance.
I didn't find any material about Mali GPU prefetching; however, I found that the OpenCL standard has a built-in function for it, prefetch().
When I tried it, the prefetch() function had no effect on my program's performance. So I think the prefetch function
may just be an empty implementation on Mali GPUs.
Does someone know:
1. Is there actually any prefetch mechanism on Mali Bifrost GPUs behind the OpenCL prefetch() function?
2. If prefetch does benefit performance, how can I use it properly?
Thank you in advance for answering my questions!
zengzeng.sun said: Thus, I think maybe the prefetch function is just an empty implementation on mali GPU
You guessed correctly - there is no prefetch on Mali GPUs, so the prefetch hints are ignored.
Kind regards,
Pete
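For anyone who wants to repeat the experiment from the question, here is a minimal sketch of an A/B setup. It is not the original poster's code: the kernel name scale, the AHEAD distance, the USE_PREFETCH macro and the build_options helper are all hypothetical names chosen for illustration. The only OpenCL built-ins assumed are get_global_id() and prefetch(), whose second argument is a count of elements. Going by the reply above, both variants should perform the same on Mali, since the hint is ignored.

```c
#include <assert.h>

/* OpenCL C source for a simple element-wise kernel (hypothetical name: scale).
 * When built with USE_PREFETCH defined, each work-item hints that the element
 * AHEAD positions further on will be read soon. prefetch() is only a hint and
 * never changes the results, so an implementation may treat it as a no-op. */
static const char *kernel_src =
    "#define AHEAD 64\n"
    "__kernel void scale(__global const float *in,\n"
    "                    __global float *out,\n"
    "                    const float k,\n"
    "                    const int n)\n"
    "{\n"
    "    const int i = get_global_id(0);\n"
    "    if (i >= n) return;\n"
    "#ifdef USE_PREFETCH\n"
    "    if (i + AHEAD < n)\n"
    "        prefetch(&in[i + AHEAD], 1);  /* hint only; may be ignored */\n"
    "#endif\n"
    "    out[i] = in[i] * k;\n"
    "}\n";

/* Returns the compiler options string for one of the two variants:
 * 0 builds the plain kernel, 1 builds the kernel with the prefetch hint. */
static const char *build_options(int use_prefetch)
{
    assert(use_prefetch == 0 || use_prefetch == 1);
    return use_prefetch ? "-D USE_PREFETCH" : "";
}
```

To compare the two, build the same source twice, passing build_options(0) and build_options(1) as the options argument of clBuildProgram, and time both kernels on identical inputs.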