Facing an issue while prefetching data

a0 = vld1_u16 (&p[width*0]); // a0 to a10 are 16x4 vectors (uint16x4_t); width is a 32-bit integer
a1 = vld1_u16 (&p[width*1]);
a2 = vld1_u16 (&p[width*2]);
a3 = vld1_u16 (&p[width*3]);
a4 = vld1_u16 (&p[width*4]);
a5 = vld1_u16 (&p[width*5]);
a6 = vld1_u16 (&p[width*6]);
a7 = vld1_u16 (&p[width*7]);
a8 = vld1_u16 (&p[width*8]);
a9 = vld1_u16 (&p[width*9]);
a10 = vld1_u16 (&p[width*10]);

All loads from a0 to a9 work correctly inside the y loop, but the load of a10 causes an issue: the a10 vector ends up filled with zeros.
The load of a10 works if it is defined inside the x for loop.
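
One way to narrow this down might be to store the loaded vector back to memory and compare it with plain scalar reads of the same address. The sketch below is only a diagnostic under assumptions: p is taken to be a const uint16_t * pointing at the first of the 11 rows, the rows are assumed to be width elements apart, and the helper names load_row_lanes and row_load_matches_scalar are hypothetical (they are not part of my original code). On a build without NEON it falls back to memcpy so it still compiles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: load the 4 uint16_t lanes at the start of row `row`
 * (rows assumed to be `width` elements apart) and copy them into `out`. */
static void load_row_lanes(const uint16_t *p, int32_t width, int row,
                           uint16_t out[4])
{
    assert(p != NULL && width > 0 && row >= 0);
    const uint16_t *src = &p[(size_t)width * (size_t)row];
#if HAVE_NEON
    uint16x4_t v = vld1_u16(src); /* same load as a0..a10 above */
    vst1_u16(out, v);             /* store the lanes so they can be inspected */
#else
    memcpy(out, src, 4 * sizeof(uint16_t)); /* scalar fallback, no NEON */
#endif
}

/* Hypothetical helper: returns 1 if the vector load of row `row` matches
 * what scalar reads see at the same address, 0 otherwise. */
static int row_load_matches_scalar(const uint16_t *p, int32_t width, int row)
{
    uint16_t lanes[4];
    load_row_lanes(p, width, row, lanes);
    return memcmp(lanes, &p[(size_t)width * (size_t)row], sizeof lanes) == 0;
}
```

Calling row_load_matches_scalar(p, width, 10) at the place in the y loop where a10 is loaded would show whether the memory itself holds zeros there or whether only the vector ends up wrong. It does not explain the cause by itself.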
