Differences between the VLD1 variants

Note: This was originally posted on 30th October 2012 at http://forums.arm.com

Hi Everybody!

There are three VLD1 instructions in armeabi-v7a:
- VLD1 (multiple single elements) on page A8-898
- VLD1 (single element to one lane) on page A8-900
- VLD1 (single element to all lanes) on page A8-902

Does anybody know what the differences between them are?
Also, how does the compiler choose which type of VLD1 to use, given that the syntax seems identical?

Thanks in advance.
  • Note: This was originally posted on 31st October 2012 at http://forums.arm.com

    Exophase!

    Thanks again for your fantastic answer!
  • Note: This was originally posted on 30th October 2012 at http://forums.arm.com

    VLD1 (multiple single elements) performs 1-4 sequential 64-bit loads to 1-4 64-bit NEON registers. It's like a normal load multiple instruction.
    VLD1 (single element to one lane) loads a single 8, 16, or 32-bit value to one lane of a vector. A lane is one element.
    VLD1 (single element to all lanes) is like the above, but it copies the loaded value into all of the lanes, so the entire vector is updated.

    The syntax isn't really the same, because you use different notations for the registers in the register list. To update the entire vector with a vector load you use the vector name, like d0. To update one lane in the vector with a scalar load you subscript the lane number in the vector, like d0[1]. To update every lane with one scalar load you use the index notation without an index number, like d0[].
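
To make the three behaviours concrete, here is a rough C model of them for a single D register. It is only a sketch: the `dreg` type and the function names `load_whole`, `load_lane` and `load_all_lanes` are made up for illustration, and the model assumes little-endian memory and ignores alignment qualifiers. Each function returns the address after writeback, i.e. what the base register would hold after the `!` form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one 64-bit D register as 8 raw bytes. */
typedef struct { uint8_t b[8]; } dreg;

/* Models VLD1 (multiple single elements) with one register in the
 * list, e.g. {d0}: all 8 bytes of the register are replaced. */
const uint8_t *load_whole(dreg *d, const uint8_t *addr)
{
    memcpy(d->b, addr, 8);
    return addr + 8;                 /* writeback advances by 8 bytes */
}

/* Models VLD1 (single element to one lane), e.g. {d0[lane]}.
 * esize is the element size in bytes (1, 2 or 4); only that one
 * lane changes, the rest of the register is preserved. */
const uint8_t *load_lane(dreg *d, int lane, size_t esize,
                         const uint8_t *addr)
{
    assert(esize == 1 || esize == 2 || esize == 4);
    assert(lane >= 0 && ((size_t)lane + 1) * esize <= 8);
    memcpy(d->b + (size_t)lane * esize, addr, esize);
    return addr + esize;             /* writeback advances by one element */
}

/* Models VLD1 (single element to all lanes), e.g. {d0[]}.
 * One element is read and replicated into every lane. */
const uint8_t *load_all_lanes(dreg *d, size_t esize,
                              const uint8_t *addr)
{
    assert(esize == 1 || esize == 2 || esize == 4);
    for (size_t off = 0; off < 8; off += esize)
        memcpy(d->b + off, addr, esize);
    return addr + esize;             /* still only one element was read */
}
```

Note that the two scalar forms read only one element from memory, so with writeback the base register moves by the element size rather than by 8.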

    Let's say that the address you're loading from contains the following, and register r0 points to it (is set to 0x0):


    0x0: 0x01
    0x1: 0x23
    0x2: 0x45
    0x3: 0x67
    0x4: 0x89
    0x5: 0xAB
    0x6: 0xCD
    0x7: 0xEF


    So this is what the code would do:


    // r0 = 0x0
    mov r1, r0
    // r1 = 0x0

    vld1.u8 { d0 }, [ r0 ]!
    // d0 as an 8x8 vector = [ 0x01, 0x23, 0x45, 0x67, 0x89, 0xAB, 0xCD, 0xEF ]
    // r0 = 0x8

    vld1.u8 { d0[5] }, [ r1 ]!

    // d0 as an 8x8 vector = [ 0x01, 0x23, 0x45, 0x67, 0x89, 0x01, 0xCD, 0xEF ]
    // r1 = 0x1

    vld1.u8 { d0[] }, [ r1 ]!

    // d0 as an 8x8 vector = [ 0x23, 0x23, 0x23, 0x23, 0x23, 0x23, 0x23, 0x23 ]
    // r1 = 0x2
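
If you don't have a NEON target at hand, the walkthrough above can be replayed with a small C simulation. This is only a sketch of the three steps, not real NEON code: the `state` struct, `print_state` and `run_example` are hypothetical names, `r0` and `r1` are modelled as plain offsets into the 8-byte memory image, and the `steps` parameter just lets you stop after the first, second or third load.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Memory image from the example; the array index is the address. */
static const uint8_t memory[8] = {
    0x01, 0x23, 0x45, 0x67, 0x89, 0xAB, 0xCD, 0xEF
};

/* Hypothetical machine state: d0 as 8 byte lanes, r0/r1 as addresses. */
typedef struct {
    uint8_t  d0[8];
    unsigned r0, r1;
} state;

static void print_state(const char *what, const state *s)
{
    printf("%-22s d0 = [", what);
    for (int i = 0; i < 8; i++)
        printf(" 0x%02X", (unsigned)s->d0[i]);
    printf(" ]  r0 = 0x%X  r1 = 0x%X\n", s->r0, s->r1);
}

/* Replays the first `steps` loads of the example (1 to 3). */
state run_example(int steps)
{
    state s = { {0}, 0, 0 };
    s.r1 = s.r0;                          /* mov r1, r0 */

    if (steps >= 1) {
        /* whole-register load of d0 from [r0] with writeback */
        memcpy(s.d0, &memory[s.r0], 8);
        s.r0 += 8;
        print_state("load { d0 }", &s);
    }
    if (steps >= 2) {
        /* vld1.u8 { d0[5] }, [ r1 ]! : only lane 5 is replaced */
        s.d0[5] = memory[s.r1];
        s.r1 += 1;
        print_state("load { d0[5] }", &s);
    }
    if (steps >= 3) {
        /* vld1.u8 { d0[] }, [ r1 ]! : one byte copied to all lanes */
        memset(s.d0, memory[s.r1], 8);
        s.r1 += 1;
        print_state("load { d0[] }", &s);
    }
    return s;
}
```

Calling `run_example(3)` from a `main` function prints the register contents after each of the three loads, which can be compared against the comments in the assembly listing.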