This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question you can start a new discussion

SVE Instruction to get index of first true element

Hello,

in SVE, I can use `svcmpeq_u64` to compare two svuint64_t vectors. The result is a `svbool_t` predicate. Now, I can use `svlastb_` functions to extract the element itself. However, as I am implementing a linear search using SVE, I am not interested in the element itself (I know what I am searching for), but the _index_ of the first true match in the svbool_t returned by the comparison function.

I was not able to find any function that gives me the index of the first true element in a svbool_t. In x86 SSE, for example, I can create a movemask and count trailing zeros to index my comparison results. In NEON, I can do the same by using an emulated movemask function, but I don't know how to do it natively in NEON. How would I do this on SVE?

Thank you so much for your help.

Best,

Maxbit

Parents
  • Dear George,

    Thank you so much again for your help and please excuse my late answer. I am currently implementing several variants that you suggested in order to benchmark them. I have one question on the UMINV approach, maybe I am overseeing something.

    The comparison result vector will be a vector of either all 1-bit or all 0-bit entries. If I element-wise AND this vector with our {1,2,3,..} index vector, the smallest non-zero number that remains is the index we are looking for. However, the AND will turn indices where there was no match into 0, and as 0 will be smaller than any of our other indices, UMINV will always return 0. Or am I getting something wrong here? We would need an instruction for the smallest non-zero value, wouldn't we?

    Thank you for clearing that up.

    Best,

    Maxbit

Reply
  • Dear George,

    Thank you so much again for your help and please excuse my late answer. I am currently implementing several variants that you suggested in order to benchmark them. I have one question on the UMINV approach, maybe I am overseeing something.

    The comparison result vector will be a vector of either all 1-bit or all 0-bit entries. If I element-wise AND this vector with our {1,2,3,..} index vector, the smallest non-zero number that remains is the index we are looking for. However, the AND will turn indices where there was no match into 0, and as 0 will be smaller than any of our other indices, UMINV will always return 0. Or am I getting something wrong here? We would need an instruction for the smallest non-zero value, wouldn't we?

    Thank you for clearing that up.

    Best,

    Maxbit

Children
  • Hi Maxbit,

    You are correct, the UMINV approach doesn't work as I imagined. You could try instead ANDing it with the bitwise negation of the index (which is a large unsigned number) and use UMAXV instead. Something like:

    #include <arm_neon.h>
    #include <stdio.h>
    
    int main() {
      uint32x4_t x = {0, 0, ~0U, ~0U};
      uint32x4_t y = {~0U, ~1U, ~2U, ~3U};
      uint32_t idx = ~vmaxvq_u32(vandq_u32(x, y));
      printf("idx = %u\n", idx);
    }

    Thanks,
    George