Why does the "vld4_lane" intrinsic have three arguments?

From "Neon Intrinsics Reference" (ARM DEN0018A)

D.9.11 VLD4_LANE
Result_t vld4_lane_type(Scalar_t* N, Vector_t M, int n);
Result_t vld4q_lane_type(Scalar_t* N, Vector_t M, int n);
Related Instruction
VLD4.dt {Dd[x], Dd+1[x], Dd+2[x], Dd+3[x]},[Rn]
VLD4.dt {Dd[x], Dd+2[x], Dd+4[x], Dd+6[x]},[Rn]
VLD1.64 {Dd, Dd+1, Dd+2, Dd+3},[Rn]

As stated above, the VLD4_LANE intrinsic translates to the VLD4 instruction. But I simply cannot figure out why it takes three arguments.

All I should ever need are the source address, the destination registers, and the lane index. What is the second argument, "Vector_t M", for?

Let's assume "int16x4x4_t a;" is mapped to d0 through d3. How should I then express the following instruction as an intrinsic?

vld4.16 {d0[0], d1[0], d2[0], d3[0]}, [pSrc]

Maybe "a = vld4_lane_s16(pSrc, a, 0);"?
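
One possible reading of the three-argument form: an intrinsic returns its result by value, so the lanes that the load does not touch have to come from somewhere, and the second argument would supply them. Below is a minimal plain-C sketch of that reading. It is a scalar model, not NEON code; `model_int16x4x4_t` and `model_vld4_lane_s16` are hypothetical names made up for illustration, while the real type and intrinsic come from `arm_neon.h`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for int16x4x4_t:
   four vectors (d0..d3 in the question), four 16-bit lanes each. */
typedef struct {
    int16_t val[4][4];
} model_int16x4x4_t;

/* Hypothetical model of a 4-element single-lane load.
   `src` is taken by value; one lane in each of the four vectors is
   overwritten with consecutive elements read from `ptr`, and the
   modified copy is returned. All other lanes pass through from `src`. */
static model_int16x4x4_t model_vld4_lane_s16(const int16_t *ptr,
                                             model_int16x4x4_t src,
                                             int lane)
{
    assert(lane >= 0 && lane < 4);   /* a 4-lane vector has lanes 0..3 */
    for (int v = 0; v < 4; ++v) {
        src.val[v][lane] = ptr[v];   /* vector v gets element v from memory */
    }
    return src;
}
```

Under this reading, passing `a` in and assigning the result back to `a` replaces lane 0 of each of the four vectors and leaves lanes 1 to 3 as they were.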

I'm really puzzled and would appreciate any help.

Thanks in advance.