• Irregular behaviour of vectors in OpenCL(1.2) kernels
    I am trying to perform an operation inside an OpenCL kernel. I have a buffer named filter, which is a 3x3 matrix initialized with the value 1. I pass this as an argument to the OpenCL kernel from...
  • Zero-copy buffers using the cl_arm_import_memory extension in OpenCL 1.2 on Arm Mali Midgard GPUs
    Hi, I wish to allocate a vector and use its data pointer to allocate a zero-copy buffer on the GPU. The cl_arm_import_memory extension can be used to do this, but I am not sure whether...
  • Optimised GPU convolution for low-memory integrated devices such as Arm processors/GPUs?
    I wish to implement convolution on Arm Mali GPUs and want it to be optimised for both speed and memory. What's the best way to do this? GEMM-based MCMK convolutions are not suited as they utilise a lot...
  • Optimised OpenCL SGEMM implementation for ARM Mali Midgard GPUs.
    I wish to implement an optimised SGEMM for Mali Midgard GPUs, which as of now only support OpenCL 1.2. As far as I know, OpenCL 1.2 doesn't support subgroup extensions and Mali GPUs don't have any benefits...
  • Texture instruction count limitation on Mali T-XXXX GPUs?
    Hi, I've been having a weird issue with some shaders. The shader program fails at link time, but there are no GL errors in the log. After a lot of testing, it seems like I am hitting an undocumented issue related...