This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question you can start a new discussion

SVE Instruction to get index of first true element

Hello,

in SVE, I can use `svcmpeq_u64` to compare two svuint64_t vectors. The result is a `svbool_t` predicate. Now, I can use `svlastb_` functions to extract the element itself. However, as I am implementing a linear search using SVE, I am not interested in the element itself (I know what I am searching for), but the _index_ of the first true match in the svbool_t returned by the comparison function.

I was not able to find any function that gives me the index of the first true element in a svbool_t. In x86 SSE, for example, I can create a movemask and count trailing zeros to index my comparison results. In NEON, I can do the same by using an emulated movemask function, but I don't know how to do it natively in NEON. How would I do this on SVE?

Thank you so much for your help.

Best,

Maxbit

Parents
  • Hi Maxbit,

    You are correct that there is no way to directly extract the index of the first predicated element in SVE, however there are a couple of different approaches you can take to achieve this.

    In particular you can use the `svindex` intrinsic to generate a vector of 0,1,... and then use `svlastb` on that instead, which will give you the index you are after.

    It is also worth pointing out that your question says the first matching element, but `svlastb` will return the last matching element instead. If you can guarantee that there is only a single matching element then this is fine, but if not then you may need to first adjust the predicate such that the last element is the one you are after. You can do this with either `svbrka_b_z` or `svpfirst_b` to ensure that the first matching element of your input predicate is the last one of the resulting predicate that you pass to `svlastb`.

    Putting that all together and you might end up with something like this:

    #include <arm_sve.h>
    #include <stdio.h>
    
    uint64_t find_first_index(svuint64_t x, svuint64_t y) {
      svbool_t ptrue = svptrue_b8();
      svbool_t p = svcmpeq_u64(ptrue, x, y);
      return svlastb_u64(svbrka_b_z(p, p), svindex_u64(0, 1));
    }
    
    int main() {
      uint64_t data1[4] = { 0, 1, 2, 3 }; // assume vl=256, adjust as needed
      uint64_t data2[4] = { 3, 2, 2, 3 };
      svbool_t ptrue = svptrue_b8();
      svuint64_t x = svld1(ptrue, data1);
      svuint64_t y = svld1(ptrue, data2);
      printf("%lu\n", find_first_index(x, y));
    }

    You will need to be careful that you do not try to use this in a situation where none of the elements match, but you can avoid this by simply checking `svptest_any` and continuing your search if none are found.

    Hope that helps,
    George

Reply
  • Hi Maxbit,

    You are correct that there is no way to directly extract the index of the first predicated element in SVE, however there are a couple of different approaches you can take to achieve this.

    In particular you can use the `svindex` intrinsic to generate a vector of 0,1,... and then use `svlastb` on that instead, which will give you the index you are after.

    It is also worth pointing out that your question says the first matching element, but `svlastb` will return the last matching element instead. If you can guarantee that there is only a single matching element then this is fine, but if not then you may need to first adjust the predicate such that the last element is the one you are after. You can do this with either `svbrka_b_z` or `svpfirst_b` to ensure that the first matching element of your input predicate is the last one of the resulting predicate that you pass to `svlastb`.

    Putting that all together and you might end up with something like this:

    #include <arm_sve.h>
    #include <stdio.h>
    
    uint64_t find_first_index(svuint64_t x, svuint64_t y) {
      svbool_t ptrue = svptrue_b8();
      svbool_t p = svcmpeq_u64(ptrue, x, y);
      return svlastb_u64(svbrka_b_z(p, p), svindex_u64(0, 1));
    }
    
    int main() {
      uint64_t data1[4] = { 0, 1, 2, 3 }; // assume vl=256, adjust as needed
      uint64_t data2[4] = { 3, 2, 2, 3 };
      svbool_t ptrue = svptrue_b8();
      svuint64_t x = svld1(ptrue, data1);
      svuint64_t y = svld1(ptrue, data2);
      printf("%lu\n", find_first_index(x, y));
    }

    You will need to be careful that you do not try to use this in a situation where none of the elements match, but you can avoid this by simply checking `svptest_any` and continuing your search if none are found.

    Hope that helps,
    George

Children