Hi,
I asked this question on the Khronos forum but got no answer, so I am asking it here.
When I build my APK with the 64-bit libraries, I always get the same error on both of my phones, a Huawei Honor Play and a Xiaomi 13T Pro.
Could someone explain why?
It is impossible to give any specific advice without a minimal reproducer with some detail of the kernel and how you are making the API call.
This error is CL_INVALID_ARG_SIZE which is returned if ...
CL_INVALID_ARG_SIZE if arg_size does not match the size of the data type for an argument that is not a memory object or if the argument is a memory object and arg_size != sizeof(cl_mem) or if arg_size is zero and the argument is declared with the local qualifier or if the argument is a sampler and arg_size != sizeof(cl_sampler).
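The rule quoted above can be read as a small size check per argument kind. The following is only a sketch of that reading, not driver code: ArgKind, arg_size_ok, fake_cl_mem and fake_cl_sampler are hypothetical names made up for illustration. The stand-in types are pointers because the Khronos headers define cl_mem and cl_sampler as pointer typedefs.

```cpp
#include <cstddef>

// Hypothetical stand-ins for illustration only. In the Khronos headers,
// cl_mem and cl_sampler are pointer typedefs, so their size follows the
// pointer size of the build.
using fake_cl_mem = void*;
using fake_cl_sampler = void*;

// Hypothetical classification of a kernel argument.
enum class ArgKind { Value, MemObject, Local, Sampler };

// Returns true when arg_size would pass the rule quoted above.
// value_size is the size of the kernel-side data type (used for Value only).
bool arg_size_ok(ArgKind kind, std::size_t arg_size, std::size_t value_size = 0) {
    switch (kind) {
        case ArgKind::Value:     return arg_size == value_size;
        case ArgKind::MemObject: return arg_size == sizeof(fake_cl_mem);
        case ArgKind::Local:     return arg_size != 0;
        case ArgKind::Sampler:   return arg_size == sizeof(fake_cl_sampler);
    }
    return false;
}
```

For example, arg_size_ok(ArgKind::Value, 4, 4) is accepted for an int kernel argument, while arg_size_ok(ArgKind::MemObject, 4) is only accepted in a build where pointers are 4 bytes.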
If this works on a 32-bit build, I'd assume that you are using sizeof() of a C data type which is mismatched with the CL type so you get a size difference when switching bitness due to the two type sizes diverging, but I'm guessing at this point.
If this doesn't help, can you share more specifics about what you are doing?
This is what I do.
Declare the buffers:
static cl::Buffer bufferNV21;
static cl::Buffer bufferRGBA;
Init the buffers:
bufferligne = cl::Buffer(gContext, CL_MEM_USE_HOST_PTR, (1024*1024)*sizeof(cl_uchar4), buf, NULL);
bufferRGBA = cl::Buffer(gContext, CL_MEM_READ_WRITE, (1080*1920)*sizeof(cl_uchar4));
Feed the buffer (this is working):
in => uint8_t* in
bufferNV21 = cl::Buffer(gContext, CL_MEM_READ_ONLY|CL_MEM_USE_HOST_PTR, isize*sizeof(cl_uchar), in, NULL);
Kernel:
__kernel void nv21torgba(
    __global int* debug,
    __global int* debugIdx,
    __global uchar4* Gray,
    __global uchar4* RGBA,
    __global uchar4* HSV,
    __global uchar4* HSL,
    __global uchar4* HSI,
    __global uchar4* MMM,
    __global uchar* in,
    int im_width,
    int im_height,
    int im_ligth)
And here is the enqueueNDRangeKernel call that causes the problem (they all cause the problem):
gNV21Kernel.setArg(0,0); // debug
gNV21Kernel.setArg(1,0); // debugIdx
gNV21Kernel.setArg(2,0); // GRAY
gNV21Kernel.setArg(3,bufferRGBA); // output RGBA
gNV21Kernel.setArg(4,0); // HSV
gNV21Kernel.setArg(5,0); // HSL
gNV21Kernel.setArg(6,0); // HSI
gNV21Kernel.setArg(7,0); // MMM
gNV21Kernel.setArg(8,bufferNV21); // input NV21
gNV21Kernel.setArg(9,w);
gNV21Kernel.setArg(10,h);
gNV21Kernel.setArg(11,ligth);
gQueue.enqueueNDRangeKernel(gNV21Kernel, cl::NDRange(448,28), cl::NDRange(1024,1024), cl::NDRange(2,2),0,&arraySecondEvent);
As you can see, I use the C++ bindings and not the C API. The input data is uchar and the output is uchar4. Everything works well in 32-bit. So it is a type problem, but I have never found or understood where this error comes from.
I am also using the latest CL headers from Khronos. I think the problem could come from the kernel compilation, since the kernel is C code that might be compiled with a C++ compiler. It is just an idea that I cannot confirm, because on Android I do not know which compiler is used.
What CPU-side data type are you using for "w", "h", and "ligth" in the code below?
gNV21Kernel.setArg(9,w);
gNV21Kernel.setArg(10,h);
gNV21Kernel.setArg(11,ligth);
... my best guess is that you have these as a "long", so the C++ template in the header is matching the type size as 64 bits on the 64-bit system, but 32 bits on the 32-bit system.
Given that this MUST be a 32-bit type for the template to match the OpenCL kernel type size, I suggest using the int32_t type from the stdint header, to avoid the variable type size problems.
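To illustrate the suggestion above: a templated setter that takes its byte count from the C++ type of the value will pass a different size for "long" depending on the build, while int32_t is 4 bytes everywhere. This is a minimal sketch under that assumption. deduced_arg_size is a hypothetical helper written for this example and is not part of the OpenCL headers.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: mimics a templated setArg(index, value) that derives
// the byte count from the C++ type of `value`. Not the real OpenCL header.
template <typename T>
std::size_t deduced_arg_size(const T&) {
    return sizeof(T);
}

// int32_t has a fixed width, so it matches a 32-bit kernel `int` in both
// 32-bit and 64-bit builds.
static_assert(sizeof(std::int32_t) == 4, "int32_t is always 4 bytes");
```

With this helper, deduced_arg_size(std::int32_t{1920}) is 4 in every build, whereas deduced_arg_size(1920L) equals sizeof(long), which is typically 4 on 32-bit Android and 8 on 64-bit Android.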
int w, int h, int ligth
I avoid using long as much as possible.
I did not understand this:
Given that this MUST be a 32-bit type for the template to match the OpenCL kernel type size, I suggest using the int32_t type from the stdint header, to avoid the variable type size problems.
Hi,
I replaced int with int32_t for the w, h and ligth variables, but I still get the same problem.
I tried replacing int with int32_t in the kernel, but the kernel compilation failed with (error: unknown type name 'int32_t').
I cannot include stdint.h in the .cl kernel file; I get a file-not-found error. So I copied stdint.h, stddef.h and compiler.h next to the kernel and used int32_t in place of int. But I still get the same error, CL_INVALID_ARG_SIZE.
So does the problem come from the cl::Buffer having the wrong size?
I found the problem after a few weeks of questioning.
The problem is that I use kernel.setArg(x,0).
It looks like, in 64-bit, setArg does not accept being set to 0.
But I have no idea why this happens. So it would be nice if someone could explain why there is such a bug. Does it come from the compiler, the driver or something else?
I tried replacing 0 with NULL but got the same error. What can I do if I want to keep my kernel unchanged?
And let me know if there is no solution other than modifying the kernel so that no argument is set to 0 with setArg.
Regards.
PS: I got no improvement in speed, and some kernels that used a range of (16,16) need to be set to (2,2). Strangely, I get the same problem with the Xiaomi 13T Pro. The Huawei seems to be the best even though it is 7 years old. How can things get worse with time?
I can get between 39 and 45 ms with the Huawei by setting the battery mode to performance, so a 30% improvement and 22 frames per second for heavy GPU and CPU processing.
The conclusion is that 32-bit gives much better results than 64-bit. Not surprising. That is for Android phones; I do not know about other platforms. So I will keep working in 32-bit, as it is more efficient.
New code:
//gNV21Kernel.setArg(0,0); // debug
//gNV21Kernel.setArg(1,0); // debugIdx
//gNV21Kernel.setArg(0,0); // GRAY
gNV21Kernel.setArg(0,bufferRGBA); // output RGBA
//gNV21Kernel.setArg(2,0); // HSV
//gNV21Kernel.setArg(4,0); // HSL
//gNV21Kernel.setArg(5,0); // HSI
//gNV21Kernel.setArg(6,0); // MMM
gNV21Kernel.setArg(1,bufferNV21); // input NV21
gNV21Kernel.setArg(2,w);
gNV21Kernel.setArg(3,h);
gNV21Kernel.setArg(4,ligth);
gQueue.enqueueNDRangeKernel(gNV21Kernel, cl::NDRange(448,28), cl::NDRange(1024,1024), cl::NDRange(2,2),0,&arraySecondEvent);
New kernel:
__kernel void nv21torgba(
    //__global int* debug,
    //__global int* debugIdx,
    //__global uchar4* Gray,
    __global uchar4* RGBA,
    //__global uchar4* HSV,
    //__global uchar4* HSL,
    //__global uchar4* HSI,
    //__global uchar4* MMM,
    __global uchar* in,
    int im_width,
    int im_height,
    int im_ligth)
Thank you for coming back to share the solution you found :)
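On the open question of why setArg(x,0) fails only in the 64-bit build, here is one possible explanation that nobody in this thread has confirmed. In C++ the literal 0 has type int, so a setter that takes its byte count from the argument type would pass 4 bytes. A __global pointer parameter expects sizeof(cl_mem), and cl_mem is a pointer typedef, so the expected size would be 4 bytes in a 32-bit build but 8 bytes in a 64-bit build. The sketch below models only that size arithmetic. FakeBuffer, size_passed and matches_mem_arg are hypothetical names, and nothing here calls the real OpenCL API.

```cpp
#include <cstddef>

// Hypothetical stand-in for a buffer wrapper holding a cl_mem-like handle.
// A default-constructed one holds a null handle.
struct FakeBuffer {
    void* handle = nullptr;
};

// Hypothetical model of a templated setter: plain values pass sizeof(T).
template <typename T>
std::size_t size_passed(const T&) {
    return sizeof(T);
}

// Hypothetical model of the memory-object case: the handle size is passed,
// regardless of whether the handle itself is null.
std::size_t size_passed(const FakeBuffer&) {
    return sizeof(void*);
}

// True when a size matches what a __global pointer argument would expect
// in this build (the size of a pointer-sized handle).
bool matches_mem_arg(std::size_t size) {
    return size == sizeof(void*);
}
```

In this model, matches_mem_arg(size_passed(0)) holds only when sizeof(int) equals the pointer size, which is the 32-bit case, while matches_mem_arg(size_passed(FakeBuffer{})) holds in both builds. If the same reasoning applies to the real bindings, passing a default-constructed cl::Buffer, or using the three-argument form setArg(index, sizeof(cl_mem), nullptr), might let the unused arguments stay in the kernel. That is an untested assumption here, and drivers may differ in how they treat null buffer arguments.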