Are arrays in an OpenCL kernel mapped to global memory instead of registers?

Hi,

I am trying to implement an OpenCL kernel on G76 with DDK r16.

I find that if I define and use an array such as "half A[16];", performance is poor.

But if I use "half16 A;" instead, performance is very good.

Is the array being mapped to global memory, and is that why performance is poor when I use it?

However, I need to use an array rather than a vector because the algorithm indexes the i-th element, "A[i]", inside a for loop.

As far as I know, a vector cannot be indexed that way; something like "A.si" is not valid.

Can anyone help me?

Thank you very much in advance!

  • A lot depends on what your code looks like and how the array is used at runtime. Some arrays can be promoted to registers and some cannot; it all depends on what the kernel code does with them.

    Can you share a kernel?

    Cheers,
    Pete
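
To illustrate the kind of access pattern the reply above is talking about, here is a minimal sketch, not the poster's actual kernel. It assumes the cl_khr_fp16 extension is available, that the loop has a fixed trip count, and that the compiler honours "#pragma unroll" (a compiler-specific hint, not guaranteed by the OpenCL C specification). The kernel name sum16 and its arguments are hypothetical. The idea is that once the loop is fully unrolled, every A[i] has a compile-time constant index, which gives the compiler a chance to keep the array in registers. If the index is only known at runtime, the array will more likely end up in memory. Whether promotion actually happens on a given driver is something to confirm by measurement.

```c
#pragma OPENCL EXTENSION cl_khr_fp16 : enable

// Hypothetical kernel: each work-item reads 16 half values and
// writes one weighted sum. Names and layout are assumptions.
__kernel void sum16(__global const half *in, __global half *out)
{
    const size_t gid = get_global_id(0);

    // Load 16 contiguous halfs as one vector (reads in[gid*16 .. gid*16+15]),
    // then copy the vector into a private array so it can be indexed as A[i].
    half16 v = vload16(gid, in);
    half A[16];
    vstore16(v, 0, A);

    half acc = (half)0.0f;

    // Fixed trip count of 16. If the compiler unrolls this loop, each A[i]
    // becomes an access with a constant index.
    #pragma unroll
    for (int i = 0; i < 16; ++i) {
        acc += A[i] * (half)(i + 1);
    }

    out[gid] = acc;
}
```

On the vector side, OpenCL C components are selected with fixed names such as A.s0 through A.sF (or A.lo, A.hi), so a runtime i cannot be used there. vload16 and vstore16, as used above, are the built-in way to move data between a vector and an array.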
