Irregular behaviour of vectors in OpenCL(1.2) kernels

So, I am trying to perform some operation inside an OpenCL kernel. I have this buffer named filter which is a 3x3 matrix initialized with value 1.

I pass this as an argument to the OpenCL kernel from the host side. The issue is when I try to fetch this buffer on the device side as a float3 vector.  For ex -

 

__kernel void(constant float3* restrict filter)
{
        float3 temp1 = filter[0];
        float3 temp2 = filter[1];
        float3 temp3 = filter[2];
}

The first two temp variables behave as expected and have all their value as 1. But, the third temp variable (temp3) has only the x component as 1 and rest of the y and z components are 0. When I fetch the buffer as only a float vector, everything behaves as expected. Am I doing something wrong? I don't want to use vload instructions as they give an overhead.

More questions in this forum