
Is an array in an OpenCL kernel mapped to global memory instead of registers?

Hi,

I am trying to implement an OpenCL kernel on a Mali-G76 GPU with DDK r16.

I find that if I declare and use an array such as "half A[16];", performance is poor.

But if I use a vector such as "half16 A;", performance is very good.

I wonder whether the array is being mapped to global memory rather than registers, which would explain the poor performance.

However, I need an array rather than a vector because the algorithm indexes the ith element, "A[i]", inside a for loop.

As far as I know, a vector component cannot be selected with a loop variable in the form "A.si".
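
A minimal sketch of the two variants being compared, for illustration only. The kernel names, arguments, and the summation are hypothetical stand-ins and not the real algorithm; the sketch assumes the device supports the cl_khr_fp16 extension.

```c
// Illustrative sketch only: kernel names, arguments, and the arithmetic
// are hypothetical and do not come from the real kernel.
#pragma OPENCL EXTENSION cl_khr_fp16 : enable

// Variant 1: private array, indexed with a loop variable.
__kernel void sum_array(__global const half *in, __global half *out)
{
    size_t gid = get_global_id(0);

    half A[16];
    for (int i = 0; i < 16; ++i)
        A[i] = in[gid * 16 + i];

    half acc = (half)0.0f;
    for (int i = 0; i < 16; ++i)
        acc += A[i];               // run-time index into the array
    out[gid] = acc;
}

// Variant 2: vector type, components selected by compile-time names.
__kernel void sum_vector(__global const half *in, __global half *out)
{
    size_t gid = get_global_id(0);

    half16 A = vload16(gid, in);   // reads in[gid*16] .. in[gid*16 + 15]

    // Component selectors such as .lo, .hi, .s0 ... .sF are fixed at
    // compile time; there is no "A.s<i>" form for a variable i.
    half8 s8 = A.lo + A.hi;
    half4 s4 = s8.lo + s8.hi;
    half2 s2 = s4.lo + s4.hi;
    out[gid] = s2.x + s2.y;
}
```

The two kernels add the same 16 values but in a different order, so the half-precision results may differ slightly in rounding.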

Can anyone help me?

Thank you very much in advance!
